“Used Car E-Commerce Stats
for Buyers, Sellers & Advertisers”
Prepared by,
Rohit G. Vaze
rvaze@hawk.iit.edu
Group number: 175
Introduction
• Sharp increase in used (second-hand) car sales in recent years
• Used cars: fairly good condition, low cost
• E-commerce websites have boosted used car sales
• Popular e-commerce website for selling used cars: eBay
• Which brands are popular in the used car market?
• Which vehicle types are popular in the used car market?
• Which factors determine the selling price (ad price) of a used car?
• How are used car prices likely to develop in the future?
Data set
Used car data set (header and 20,000 rows)
Checking for the missing values
No missing values
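A missing-value check of this kind can be sketched in Python as follows. The three-row CSV extract and its column subset are made up for illustration (the real data set has 20,000 rows), and `count_missing` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical three-row extract, NOT taken from the real data set.
SAMPLE_CSV = """price,vehicletype,powerPS
3999,sedan,102
4500,suv,
5200,,150
"""

def count_missing(csv_text):
    """Return {column: number of empty cells} for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            # An absent or blank cell counts as missing.
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts

missing = count_missing(SAMPLE_CSV)
print(missing)
```

A data set with no missing values would give a count of zero for every column.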
Top selling vehicle types in the used car market
Sedans = 9,416
SUVs = 6,542
Boxplot
Median
Sedan = $3,999
SUV = $4,500
Mean
Sedan = $5,196.55
SUV = $6,116.27
Hypothesis testing
Interpretation
Since (p-value = 1) > (alpha = 0.05),
we fail to reject the null hypothesis
Hence, we can say
The test gives no evidence that the mean selling price of sedans exceeds that of SUVs; this is consistent with the sample means, where sedans are cheaper on average than SUVs in the
used car market
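A one-sided two-sample comparison of this kind can be sketched in Python as below. This is a minimal sketch, assuming a Welch-type statistic with a normal approximation (reasonable only for large samples such as the ones here); the slides do not show which test function was used. The price lists are made up, and `welch_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_z_test(sample_a, sample_b):
    """One-sided test of H1: mean(a) > mean(b).

    Uses the Welch statistic with a normal approximation, which is only
    reasonable for large samples.
    """
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    p_value = 1.0 - NormalDist().cdf(z)  # upper-tail probability
    return z, p_value

# Made-up prices, NOT taken from the data set.
sedan = [3200, 3999, 4100, 5200, 6100, 4500, 3900, 5000]
suv = [4500, 5200, 6100, 7000, 5600, 6400, 4900, 6800]

z, p = welch_z_test(sedan, suv)
print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```

When the first sample mean is clearly below the second, the statistic is negative and the upper-tail p-value approaches 1, so the null hypothesis is not rejected.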
Top selling brands in the used car market
Toyota = 7,919 units
Honda = 4,488 units
Boxplot
Median
Toyota = $5,000
Honda = $3,150
Mean
Toyota = $6,961.88
Honda = $3,997.25
Hypothesis testing
Interpretation
Since (p-value < 2.2e-16) < (alpha = 0.05),
we reject the null hypothesis
Hence, we can say
The mean selling price of Toyota cars in the used car market is higher than the mean selling price of
Honda cars in the used car market
Multiple linear regression
• We will check linear association between variables
• Divide data into training and testing data sets
• We will eliminate the insignificant variables and determine the variables that significantly affect the dependent variable “price”
• Residual analysis
Correlation plot
Values of the correlations
Highest correlations with price (moderate for powerPS, weak for the others)
powerPS, price = 0.4812459
gearbox, price = 0.2236955
vehicletype, price = 0.119897
fueltype, price = 0.1153594
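The correlation values above are Pearson coefficients between each variable and price. A minimal Python sketch of the computation follows; the six data points are made up for illustration and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Made-up values, NOT taken from the data set.
power_ps = [60, 75, 90, 110, 140, 170]
price = [1500, 2900, 2500, 5200, 4800, 9000]

r = pearson_r(power_ps, price)
print(f"r(powerPS, price) = {r:.3f}")
```

A value near 1 or -1 indicates a strong linear association, a value near 0 a weak one.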
Eliminating the insignificant independent variables
Significant independent variables
• vehicletype
• gearbox
• powerPS
• kilometer
• fueltype
• brand
• postalcode
Using the function: step(full, direction = "backward", trace = T)
Expression of the regression model:
Y = 5306.79366 + 381.39684 * vehicletype + 461.67136 * gearbox + 50.24259 * powerPS - 0.06181 * kilometer + 1015.12720 * fueltype - 205.69740 * brand + 0.01067 * postalcode
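To show how the fitted equation is applied, the following Python sketch evaluates it for one car. The intercept and coefficients are copied from the slide; the example car and the numeric codes used for the categorical variables (vehicletype, gearbox, fueltype, brand) are assumptions, since the slides do not show how those variables were encoded.

```python
# Intercept and coefficients as reported on the slide.
INTERCEPT = 5306.79366
COEFFICIENTS = {
    "vehicletype": 381.39684,
    "gearbox": 461.67136,
    "powerPS": 50.24259,
    "kilometer": -0.06181,
    "fueltype": 1015.12720,
    "brand": -205.69740,
    "postalcode": 0.01067,
}

def predict_price(car):
    """Evaluate the linear model for a dict of already-encoded predictor values."""
    return INTERCEPT + sum(COEFFICIENTS[name] * car[name] for name in COEFFICIENTS)

# Hypothetical car; the category codes (1) are placeholders, not the real encoding.
example_car = {
    "vehicletype": 1,
    "gearbox": 1,
    "powerPS": 100,
    "kilometer": 100_000,
    "fueltype": 1,
    "brand": 1,
    "postalcode": 50_000,
}

print(f"predicted price: {predict_price(example_car):.2f}")
```

The negative kilometer coefficient means that, with all other inputs fixed, each additional kilometer lowers the predicted price by about 0.06.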
VIF values
vehicletype = 1.028207
gearbox = 1.166104
powerPS = 1.190127
kilometer = 1.019822
fueltype = 1.060202
brand = 1.081085
postalcode = 1.008306
All VIF values < 5: no collinearity problem
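The variance inflation factor of predictor j is VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing that predictor on all the others. The Python sketch below inverts this relation for the values on the slide to show how little of each predictor is explained by the rest; the helper names are hypothetical.

```python
# VIF values as reported on the slide.
VIF = {
    "vehicletype": 1.028207,
    "gearbox": 1.166104,
    "powerPS": 1.190127,
    "kilometer": 1.019822,
    "fueltype": 1.060202,
    "brand": 1.081085,
    "postalcode": 1.008306,
}

def vif_from_r_squared(r_squared):
    """VIF_j = 1 / (1 - R_j^2)."""
    return 1.0 / (1.0 - r_squared)

def implied_r_squared(vif):
    """Invert the VIF formula: R_j^2 = 1 - 1 / VIF_j."""
    return 1.0 - 1.0 / vif

for name, value in VIF.items():
    print(f"{name:12s} VIF = {value:.3f}  implied R^2 = {implied_r_squared(value):.3f}")

# Threshold of 5 as used on the slide.
no_collinearity = all(value < 5 for value in VIF.values())
print("no collinearity problem:", no_collinearity)
```

A VIF of 5 corresponds to R_j^2 = 0.8, so the threshold flags predictors that are at least 80% explained by the others.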
Checking the normality of the given data
Predicted vs Residual Plot
Since the majority of data points are concentrated around the reference line, we can say that the residuals are approximately normally distributed
Predicted vs Residual Plot
As the majority of data points lie close to the reference line, we can say that the residuals are approximately normally distributed
Residual analysis and predictions
Choosing the best model: the one with the lowest RMSE value
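Model comparison by RMSE can be sketched in Python as below. The actual prices, the two sets of predictions, and the model names are made up for illustration; the slides do not list the candidate models or their RMSE values.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Made-up test-set prices and predictions from two hypothetical models.
actual = [3000, 4500, 6000, 8000]
predictions = {
    "full model": [3400, 4300, 6500, 7200],
    "reduced model": [3200, 4600, 6100, 7700],
}

scores = {name: rmse(actual, predicted) for name, predicted in predictions.items()}
best_model = min(scores, key=scores.get)
print(scores, "->", best_model)
```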
ANOVA (for the independent variable ‘vehicletype’)
Null hypothesis:
The mean price is the same for all vehicle types
Alternative hypothesis:
The mean price differs for at least one vehicle type
Interpretation:
(p-value < 2e-16) < (alpha = 0.05)
Hence, we reject the null hypothesis
Hence, we can say that the mean price is not the same across vehicle types
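The one-way ANOVA F statistic behind such a test can be sketched in Python as follows. The three small groups are made up for illustration, and the sketch stops at the F statistic; the p-value would come from the F distribution with (k - 1, n - k) degrees of freedom.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of numbers)."""
    all_values = [value for group in groups for value in group]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (group_mean - grand_mean) ** 2
        for group, group_mean in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (value - group_mean) ** 2
        for group, group_mean in zip(groups, group_means)
        for value in group
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up groups, NOT taken from the data set.
f_statistic = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_statistic:.2f}")
```

A large F means the group means differ by much more than the within-group scatter would explain.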
ANOVA (for the independent variable ‘brand’)
Null hypothesis:
The mean price is the same for all brands
Alternative hypothesis:
The mean price differs for at least one brand
Interpretation:
(p-value < 2e-16) < (alpha = 0.05)
Hence, we reject the null hypothesis
Hence, we can say that the mean price is not the same across brands
ANOVA (for the independent variable ‘fueltype’)
Null hypothesis:
The mean price is the same for all fuel types
Alternative hypothesis:
The mean price differs for at least one fuel type
Interpretation:
(p-value < 2e-16) < (alpha = 0.05)
Hence, we reject the null hypothesis
Hence, we can say that the mean price is not the same across fuel types
Time series analysis and forecasting
• Loading the data set and required libraries
• Normality test: histogram, QQ plot, Jarque-Bera test
• Ljung-Box test, ACF plots, PACF plots
• Differencing
• Build different models: AR, MA, ARMA, ARIMA
• Future predictions
Jarque-Bera test, histogram, and QQ plot before differencing
Since
(p-value < 2.2e-16) < (alpha = 0.05),
we reject the null hypothesis of normality: before differencing, the distribution is
not normal
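The Jarque-Bera test takes normality as its null hypothesis, so a small p-value is evidence against a normal distribution. A minimal Python sketch follows: the statistic is JB = n/6 * (S^2 + (K - 3)^2 / 4) with sample skewness S and kurtosis K, and the asymptotic p-value uses the chi-square distribution with 2 degrees of freedom, whose upper tail is exp(-JB / 2). The sample is made up for illustration.

```python
from math import exp

def jarque_bera(xs):
    """Return (JB statistic, asymptotic p-value) for a sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    p_value = exp(-jb / 2)  # upper tail of chi-square with 2 degrees of freedom
    return jb, p_value

# Made-up, strongly right-skewed sample (NOT taken from the data set).
skewed = [1.0] * 95 + [50.0] * 5
jb, p = jarque_bera(skewed)
print(f"JB = {jb:.1f}, p = {p:.3g}")
print("reject normality" if p < 0.05 else "fail to reject normality")
```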
Time series plots
Mean and variance look constant with time: Stationary series
Normality test on differenced time series object
The majority of points lie on the line. Hence, we can say
that the distribution of the differenced series is approximately normal
Checking for serial correlation
Null hypothesis:
The series is not serially correlated (all autocorrelations of the time series object are
zero)
Alternative hypothesis:
The series is serially correlated
Since
p-value < (alpha = 0.05),
we reject the null hypothesis
Hence,
we can say that serial correlation exists
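The Ljung-Box statistic used for this kind of check is Q = n(n + 2) * sum over lags k of r_k^2 / (n - k), where r_k is the lag-k sample autocorrelation. The Python sketch below computes Q for a made-up trending series and compares it with 3.841, the 5% critical value of the chi-square distribution with 1 degree of freedom; the series and the choice of a single lag are assumptions for illustration.

```python
def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    denominator = sum((x - mean) ** 2 for x in xs)
    numerator = sum((xs[t] - mean) * (xs[t - lag] - mean) for t in range(lag, n))
    return numerator / denominator

def ljung_box_q(xs, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(xs)
    return n * (n + 2) * sum(
        autocorrelation(xs, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

# Made-up trending series: consecutive values are strongly correlated.
series = list(range(20))
q = ljung_box_q(series, max_lag=1)

CRITICAL_5_PERCENT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom
print(f"Q = {q:.2f}")
print("serial correlation detected" if q > CRITICAL_5_PERCENT_DF1 else "no evidence")
```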
Selecting the best model based on AIC value
Model using EACF:
AIC = 981.97
Model using AR:
AIC = 1124.99
Model using MA:
AIC = 979.32
The MA model is the best model, as its AIC value is the lowest
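AIC is defined as 2k - 2 ln(L), where k is the number of estimated parameters and L the maximized likelihood, so lower is better. The Python sketch below applies the selection rule to the three AIC values reported above; the dictionary labels are shorthand, and the `aic` helper is included only to show the definition.

```python
def aic(num_parameters, log_likelihood):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * num_parameters - 2 * log_likelihood

# AIC values as reported on the slides.
reported_aic = {
    "EACF-selected model": 981.97,
    "AR": 1124.99,
    "MA": 979.32,
}

best = min(reported_aic, key=reported_aic.get)

# Difference of each model from the best one.
for name, value in sorted(reported_aic.items(), key=lambda item: item[1]):
    print(f"{name:20s} AIC = {value:8.2f}  delta = {value - reported_aic[best]:6.2f}")
```

The gap between the MA model and the EACF-selected model is small (2.65), while the AR model is clearly worse.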
Plots
Residual analysis for MA model
The Ljung-Box test result indicates that the residuals are independent
(close to a white noise series)
Our model is adequate
Predictions for the future
Fuel type preferred
Buyers of used cars prefer:
Petrol cars: 11,989 out of 20,000
Diesel cars: 7,705 out of 20,000
Electric cars: 249 out of 20,000
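The counts above translate into market shares as in the short Python sketch below. The counts are taken from the slide; the remaining cars are grouped under a label "other" chosen here for illustration, since the slide does not name their fuel types.

```python
TOTAL_CARS = 20_000

# Counts as reported on the slide.
counts = {"petrol": 11_989, "diesel": 7_705, "electric": 249}

shares = {fuel: count / TOTAL_CARS for fuel, count in counts.items()}
other_cars = TOTAL_CARS - sum(counts.values())  # fuel types not listed on the slide

for fuel, share in shares.items():
    print(f"{fuel:8s} {share:.1%}")
print(f"other    {other_cars / TOTAL_CARS:.1%}")
```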
Conclusion
• Top 3 popular vehicle types in the used car market: 1. Sedan 2. SUV 3. Cabriolet
• Top 5 popular brands in the used car market: 1. Toyota 2. Honda 3. Ford 4. Mercedes-Benz 5. BMW
• The mean selling price of sedans in the used car market is lower than the mean selling price of SUVs
• The mean selling price of Toyota cars in the used car market is higher than the mean selling price of Honda cars, yet more Toyota cars than Honda cars are sold. This may suggest that buyers regard Toyota cars as more reliable and better built
• The power of a car, the type of gearbox, and the fuel type are important factors that influence the selling price of a car in the used car market
• Petrol cars sell the most (60%) in the used car market, followed by diesel cars
• The forecast suggests that used car selling prices will stay in roughly the same range as now
THANK YOU

Data Analytics Project Presentation
