The document analyzes car price trends using data from 2,108 used cars scraped from www.carfax.com. Various regression models were created to predict price based on variables like mileage, engine, fuel type, and MPG. The models were compared using metrics like R-squared, AIC, and BIC. Hypothesis tests found that the average Ford Mustang price was not significantly greater than $19,000, but that the average highway MPG was significantly lower than 28. Confidence intervals were constructed for price and mileage.
1. Car Trends in www.carfax.com
STATISTICAL DATA MINING
ISM6137.901S17
PROJECT REPORT ON
Car Trends in www.carfax.com
AUTHORS:
Jijo Johny
Deepesh Kumar
Pradeep Raj Ooralath
Majed Alghamdi
Contents
Introduction
About www.carfax.com
Web Scraping
Dataset
Relationships between variables
    Price vs Mileage
    Price vs Engine
    Price vs Fuel-type
    Price vs MPG_HWY conditioned on Engine
    Correlation plot of all variables
Regression Models
    Model 1
    Model 2
    Model 3
    Models 1, 2, 3 Comparison
    Model 4
    Model 5
    Model 6
    Model 7
Hypothesis Testing
Confidence Interval
Conclusion
Introduction
The aim of this project is to analyze the trends of car prices on www.carfax.com and to
predict how prices vary with make, model, year of manufacture, mileage, type of engine,
fuel type, exterior color and interior color. The data was obtained by scraping the
website www.carfax.com with the Chrome extension “Web Scraper”. The data was then
cleaned using R, and various models were created to analyze the price trend and to
predict the price from the most influential variables. Many regression models were
fitted to the data and compared on R squared value, AIC, BIC and the residual plots of
each model. We have considered only sedans belonging to the segment.
Data for the following cars were extracted from Carfax:
1. Honda Accord
2. Toyota Camry
3. Honda Civic
4. Hyundai Elantra
5. Ford Fusion
6. Kia Optima
7. Chevrolet Malibu
8. Nissan Altima
9. Ford Mustang
About www.carfax.com
Carfax, Inc. is a commercial web-based service that supplies vehicle history reports on
used cars and light trucks to individuals and businesses in the United States and
Canada. CARFAX started with a vision - to be the leading source of vehicle history
information for buyers and sellers of used cars. Today, CARFAX has the most
comprehensive vehicle history database available in North America. Millions of
consumers trust CARFAX to provide them with vehicle history information every year.
CARFAX receives information from more than 100,000 data sources including every U.S.
and Canadian provincial motor vehicle agency plus many auto auctions, fire and police
departments, collision repair facilities, fleet management and rental agencies, and more.
CARFAX Vehicle History Reports™ are available on all used cars and light trucks model
year 1981 or later. Using the unique 17-character vehicle identification number (VIN), a
CARFAX Report is instantly generated from their database of over 17 billion records.
Every CARFAX Report contains information that can impact a consumer's decision about
a used vehicle. Some types of information that a CARFAX Report may include are:
Title information, including salvaged or junked titles
Flood damage history
Total loss accident history
Odometer readings
Lemon history
Number of owners
Accident indicators, such as airbag deployments
State emissions inspection results
Service records
Vehicle use (taxi, rental, lease, etc.)
Web Scraping
Web scraping was done with the Chrome extension “Web Scraper”. The following
information was extracted from the website for each car:
1. Make
2. Model
3. Year
4. Mileage
5. Price
6. Engine
7. Transmission
8. Fuel type
9. MPG
10. Exterior color
11. Interior color
[Figure: Web scraping using the “Web Scraper” Chrome extension]
Dataset
Source of Data: www.carfax.com
Number of Variables: 12
Total number of instances: 2108
Variable      Type         Values / Description
Make          Categorical  Honda, Toyota, Hyundai, Ford, etc.
Model         Categorical  Accord, Civic, Camry, Mustang, etc.
Year          Categorical
Price         Numeric
Mileage       Numeric
Engine        Categorical  4, 6 and 8 cylinder
Transmission  Categorical  Manual and automatic
Fuel type     Categorical  Gasoline, hybrid and flexible fuel
MPG_CY        Numeric      MPG in cities
MPG_HWY       Numeric      MPG on highways
Ext Color     Categorical  Exterior color
Int Color     Categorical  Interior color
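The variable types listed above can be set in R once the scraped file is loaded. The following is only a minimal sketch: the two rows, the raw text formats such as "$15,500", and the helper name to_number are assumptions made for illustration and are not taken from the scraped data.

```r
# Hypothetical rows for illustration only (not from the Carfax sample).
listings <- data.frame(
  Make    = c("Honda", "Ford"),
  Model   = c("Accord", "Mustang"),
  Year    = c(2014, 2012),
  Price   = c("$15,500", "$21,000"),   # assumed raw text format
  Mileage = c("42,000", "35,500"),     # assumed raw text format
  Engine  = c("4 Cyl", "8 Cyl"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip currency symbols and thousands separators.
to_number <- function(x) as.numeric(gsub("[^0-9.]", "", x))

# Numeric variables
listings$Price   <- to_number(listings$Price)
listings$Mileage <- to_number(listings$Mileage)

# Categorical variables (Year is treated as categorical, as in the table)
categorical <- c("Make", "Model", "Year", "Engine")
listings[categorical] <- lapply(listings[categorical], factor)

str(listings)
```

Treating Year as a factor gives each model year its own coefficient in a regression instead of forcing a single linear trend.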
Relationships between variables
Price vs Mileage
Price and mileage have a negative relationship.
Price vs Engine
1. Cars with 8 cylinder engines are generally more expensive.
2. Cars with 4 cylinder and 6 cylinder engines are mostly similar in price.
Price vs Fuel-type
Flexible fuel type cars are generally less expensive than hybrid and gasoline cars.
Price vs MPG_HWY conditioned on Engine
MPG_HWY is lower for 8 cylinder cars but the price is generally higher because they are
more powerful.
Correlation plot of all variables
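The numbers behind a correlation plot come from a correlation matrix of the numeric variables. As a rough sketch, the snippet below uses R's built-in mtcars data as a stand-in, because the scraped data frame is not reproduced here; the column names are those of mtcars, not of the Carfax sample.

```r
# Stand-in data: R's built-in mtcars, NOT the Carfax sample.
num_vars <- mtcars[, c("mpg", "wt", "hp", "disp")]

# Pairwise Pearson correlations, rounded for readability
cm <- round(cor(num_vars), 2)
print(cm)
```

For the Carfax data the same call would be applied to the numeric columns (Price, Mileage, MPG_CY, MPG_HWY).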
Regression Models
Model 1
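The comparison criteria named in the Introduction (R squared, AIC and BIC) can be computed for any set of lm fits. The sketch below is illustrative only: it uses R's built-in mtcars data and three made-up formulas as stand-ins, since the model formulas of this report are not shown here.

```r
# Stand-in data and formulas (assumptions): mtcars, NOT the Carfax models.
d <- mtcars
d$cyl <- factor(d$cyl)  # treat cylinder count as categorical, like Engine

m1 <- lm(mpg ~ wt, data = d)
m2 <- lm(mpg ~ wt + cyl, data = d)
m3 <- lm(mpg ~ wt + cyl + hp, data = d)
models <- list(m1 = m1, m2 = m2, m3 = m3)

# One row per model: higher adjusted R squared and lower AIC/BIC are better
comparison <- data.frame(
  adj_r2 = sapply(models, function(m) summary(m)$adj.r.squared),
  AIC    = sapply(models, AIC),
  BIC    = sapply(models, BIC)
)
print(comparison)
```

BIC penalizes extra parameters more heavily than AIC for samples of this size, so the two criteria can disagree on which model is best.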
Prediction
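# Split the data: 40% of the rows are sampled for training, the rest are kept for testing
smp_size <- floor(0.40 * nrow(d))
set.seed(123)  # make the random split reproducible
train_ind <- sample(seq_len(nrow(d)), size = smp_size)
train <- d[train_ind, ]
test <- d[-train_ind, ]
# Predict prices for the test rows with a fitted lm model (mod9 here)
pred <- predict.lm(mod9, test, type = "response")
err <- pred - test$Price
# Count a prediction as correct when the signed error is below $2000
pred.prob <- err < 2000
table(pred.prob)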
Model   Wrongly Predicted   Correctly Predicted   Accuracy (%)
mod 5   211                 1054                  83.3202
mod 6   210                 1055                  83.3992
mod 7   183                 1082                  85.5336
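The accuracy figures above are the share of test rows whose prediction error falls below $2000. A minimal sketch of that calculation follows; the function name tolerance_accuracy and its signed argument are hypothetical. With signed = TRUE it mirrors the err < 2000 rule used above, which also counts large under-predictions as correct; signed = FALSE is a stricter alternative based on the absolute error and is not the rule used in the report.

```r
# Hypothetical helper: percentage of predictions within a dollar tolerance.
tolerance_accuracy <- function(pred, actual, tol = 2000, signed = TRUE) {
  err <- pred - actual
  hit <- if (signed) err < tol else abs(err) < tol
  100 * mean(hit)
}

# Arithmetic check of the mod 7 row: 1082 correct out of 183 + 1082 test rows
mod7_accuracy <- 100 * 1082 / (183 + 1082)
round(mod7_accuracy, 4)
```

Usage would look like tolerance_accuracy(pred, test$Price) for any of the fitted models.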
Hypothesis Testing
1. Carfax claims that the average price of the Ford Mustang cars they sold in the
past year is $19000. KBB, the main competitor of Carfax, believed that this value
is lower than the actual average because the average price of the Ford Mustang
cars that KBB sold in the previous year was more than $20000. So, they decided
to prove that Carfax is lying by taking a sample. (Significance level : 5%)
Ha : Average Price > $19000
H0: Average Price <= $19000
> M <- mean(mus$Price); M;
[1] 19381.24
> n <- length(mus$Price); n;
[1] 352
> s <- sd(mus$Price); s;
[1] 7927.969
> se <- s/sqrt(n); se;
[1] 422.5621
> Z <- (M - 19000)/se; Z;
[1] 0.9022073
> P <- pnorm(Z, lower.tail = FALSE); P;
[1] 0.1834734
Since P > Significance level, we don’t have enough evidence to reject H0. Hence,
we cannot conclude that the average price is greater than $19000.
2. Carfax claims that the Ford Mustang cars they sell have an average MPG of 28 on
highways. KBB wants to prove that this claim is false (Significance level = 0.05).
Ha : Average MPG < 28
H0: Average MPG >=28
> M <- mean(mus$MPG_HWY); M;
[1] 26.52841
> n <- length(mus$MPG_HWY); n;
[1] 352
> s <- sd(mus$MPG_HWY); s;
[1] 2.915581
> se <- s/sqrt(n); se;
[1] 0.155401
> Z <- (M - 28)/se; Z;
[1] -9.469635
> P <- pnorm(Z); P;
[1] 1.404113e-21
Since, P < Significance level, we have enough evidence to reject H0. Hence, we
can conclude that the average MPG in highways is less than 28.
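Both tests above can be reproduced from the reported summary statistics alone. The sketch below wraps the calculation in a hypothetical helper, z_test, which is not part of the original analysis. For a right-tailed alternative (Ha: mean > value) the p-value is the upper tail of the normal distribution; for a left-tailed alternative (Ha: mean < value) it is the lower tail.

```r
# Hypothetical helper: one-sided z-test from summary statistics.
z_test <- function(m, s, n, mu0, alternative = c("greater", "less")) {
  alternative <- match.arg(alternative)
  se <- s / sqrt(n)            # standard error of the mean
  z <- (m - mu0) / se          # test statistic
  p <- if (alternative == "greater") {
    pnorm(z, lower.tail = FALSE)  # upper tail for Ha: mean > mu0
  } else {
    pnorm(z)                      # lower tail for Ha: mean < mu0
  }
  list(z = z, p = p)
}

# Test 1: Ford Mustang price, Ha: average price > 19000
price_test <- z_test(19381.24, 7927.969, 352, 19000, "greater")

# Test 2: Ford Mustang highway MPG, Ha: average MPG < 28
mpg_test <- z_test(26.52841, 2.915581, 352, 28, "less")

print(price_test)
print(mpg_test)
```

With the raw data available, t.test(mus$Price, mu = 19000, alternative = "greater") would be a common alternative; with n = 352 the t and z results are expected to be very close.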
Confidence Interval
1. Constructing a 95% confidence interval for the average price of Ford Mustang.
Based on this confidence interval, what is the minimum price that a buyer can
expect (with 95% confidence)?
> M <- mean(mus$Price); M;
[1] 19381.24
> n <- length(mus$Price); n;
[1] 352
> s <- sd(mus$Price); s;
[1] 7927.969
> se <- s/sqrt(n); se;
[1] 422.5621
> moe <- qnorm(0.975) * se; moe;
[1] 828.2066
> ci <- M + c(-moe, moe); ci;
[1] 18553.03 20209.45
The minimum average price that a buyer can expect is about $18553, the lower limit of
the interval.
2. Constructing a 95% confidence interval for the average mileage of Ford Mustang.
Based on this confidence interval, what is the minimum mileage that a buyer can
expect (with 95% confidence)?
> M <- mean(mus$Mileage); M;
[1] 43469.36
> n <- length(mus$Mileage); n;
[1] 352
> s <- sd(mus$Mileage); s;
[1] 33745.19
> se <- s/sqrt(n); se;
[1] 1798.625
> moe <- qnorm(0.975) * se; moe;
[1] 3525.24
> ci <- M + c(-moe, moe); ci;
[1] 39944.12 46994.60
The minimum average mileage that a buyer can expect is around 40000 miles, the lower
limit of the interval.
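The interval calculations above follow one pattern: mean plus or minus a normal quantile times the standard error. A minimal sketch that reproduces them from the reported summary statistics is shown below; the helper name mean_ci is hypothetical.

```r
# Hypothetical helper: normal-approximation confidence interval for a mean.
mean_ci <- function(m, s, n, level = 0.95) {
  se <- s / sqrt(n)                          # standard error of the mean
  moe <- qnorm(1 - (1 - level) / 2) * se     # margin of error
  c(lower = m - moe, upper = m + moe)
}

# Ford Mustang price and mileage, using the statistics reported above
price_ci   <- mean_ci(19381.24, 7927.969, 352)
mileage_ci <- mean_ci(43469.36, 33745.19, 352)

print(round(price_ci, 2))
print(round(mileage_ci, 2))
```

These intervals describe the average over all Mustangs listed, so an individual car can be priced well outside them.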
3. In an annual sales report that Carfax published, they showed that the average price
of the Nissan Altima cars they sold is greater than that of Toyota Camry. Let’s see
whether this is correct by analyzing the sample we collected.
Nissan Altima:
> M <- mean(alt$Price); M;
[1] 13528.43
> n <- length(alt$Price); n;
[1] 366
> s <- sd(alt$Price); s;
[1] 4621.014
> se <- s/sqrt(n); se;
[1] 241.5443
> moe <- qnorm(0.975) * se; moe;
[1] 473.4181
> ci <- M + c(-moe, moe); ci;
[1] 13055.01 14001.85
95% confidence interval: [13055 14002]
Toyota Camry:
> M <- mean(cam$Price); M;
[1] 12268.13
> n <- length(cam$Price); n;
[1] 332
> s <- sd(cam$Price); s;
[1] 4726.381
> se <- s/sqrt(n); se;
[1] 259.3939
> moe <- qnorm(0.975) * se; moe;
[1] 508.4026
> ci <- M + c(-moe, moe); ci;
[1] 11759.73 12776.53
95% confidence interval: [11760 12777]
The confidence intervals of the average prices of Altima and Camry do not overlap.
Therefore, we can conclude that the average price of Altima is greater than that of
Camry, with 95% confidence.
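Checking whether two intervals overlap is a conservative way to compare means. A more direct approach, which is not the method used above, is to build an interval for the difference of the two averages. The sketch below does this from the reported summary statistics, assuming the two samples are independent; the helper name diff_means is hypothetical.

```r
# Hypothetical helper: z interval and one-sided test for a difference of means,
# assuming independent samples.
diff_means <- function(m1, s1, n1, m2, s2, n2, level = 0.95) {
  se <- sqrt(s1^2 / n1 + s2^2 / n2)        # standard error of the difference
  d <- m1 - m2
  moe <- qnorm(1 - (1 - level) / 2) * se
  list(
    diff = d,
    z = d / se,
    p_greater = pnorm(d / se, lower.tail = FALSE),  # Ha: mean1 > mean2
    ci = c(lower = d - moe, upper = d + moe)
  )
}

# Nissan Altima vs Toyota Camry, using the statistics reported above
altima_vs_camry <- diff_means(13528.43, 4621.014, 366,
                              12268.13, 4726.381, 332)
print(altima_vs_camry)
```

If the interval for the difference lies entirely above zero, the data support a higher average price for the Altima at the chosen confidence level.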
Conclusion:
Major factors that influence the price of a car are:
• Year: A recently built car has a higher price than older cars. Cars built in 2014
have a higher price than the others.
• Mileage: A car that has run more miles fetches a lower price.
• Engine: The engine type has a major influence on the price.
• Exterior color influences a buyer when purchasing a car.