
Raji Balasubramaniyan, Senior Data Scientist, Manheim at MLconf ATL - 9/18/15


Leveraging Machine Learning Techniques for Vehicle Auction Industry: Online shopping has grown in popularity over the years, and many shoppers now turn to online shopping sites. By recommending content that is relevant to online shoppers, we minimize the time they spend online and maximize the business success of online shopping sites. Many online sites now use recommendation systems, and they leverage content-based and/or context-based collaborative filtering machine learning techniques for this purpose. We have leveraged a few machine learning techniques, such as collaborative filtering, neural networks and Bayesian learning, for relevant-content vehicle recommendation, and time series forecasting for vehicle auctions at Manheim. My talk will focus on some of these techniques and their uses in relevant content recommendation.

Published in: Technology

  1. Leveraging Machine Learning Techniques for the Vehicle Auction Industry. Raji Balasubramaniyan, PhD, Senior Data Scientist, Manheim, Inc. Manheim | Proprietary and Confidential
  2. Overview
     • Automobile auction
       – Manheim
     • Introduce the ML use cases
       – Churn rate
       – Recommendations
       – Forecasting
     • How to approach a problem?
       – Tools and algorithms used
     • Q&A
  3. Manheim, Inc.: Automobile auction
     Manheim provides auction services for the physical sale of automobiles, as well as online tools that connect wholesale vehicle buyers and sellers.
     • Leader in the wholesale vehicle auction industry: 85% of the vehicle auction business happens at Manheim
     • Over 100 locations across the US and Canada
     • About 15 million cars go through auction every year
  4. ML use case 1: Predicting churn rate
     • What is churn?
       – Churn rate refers to the proportion of members who leave during a given time period
     • Motto: Make the customer happy
       – If the customer is happy, he/she won't churn
     • Why is it important?
       – It helps us predict and analyze the parameters that drive customers away, which helps the sales force team focus on those parameters and coach the customer
  5. Predicting churn rate: The approach
     • Step 1
       – Create profiles for current and cancelled members by collecting their behavior data for the last 6 months
         • Activity, transactions, messages, response time, etc.
     • Step 2
       – Segment the customers according to their behavior
         • Unsupervised clustering
     • Step 3
       – For every segment, perform supervised learning to select the parameters that distinguish current members from cancelled members
         • Logistic regression, neural net
     • Step 4
       – Include sentiment analysis to add another score
  6. Algorithms: Unsupervised k-means clustering
     • Given a set of observations (x1, x2, …, xn), where each observation is a d-dimensional real vector consisting of one member's parameters, k-means clustering aims to partition the n observations into k (≤ n) sets S = {Successful Seller, Successful Buyer, Buyer at risk, Seller at risk, Undecided} so as to minimize the within-cluster sum of squares (WCSS). In other words, its objective is to find:
       $$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
     • where μi is the mean of the points in Si.
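The k-means objective on slide 6 can be illustrated with a minimal sketch of Lloyd's algorithm in plain Python. This is not the speaker's implementation: the function names, the random initialization and the toy feature vectors are assumptions made only for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels, wcss)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random points
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda i: sq_dist(p, centroids[i]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                updated.append(tuple(sum(dim) / len(members)
                                     for dim in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    # Within-cluster sum of squares: the quantity k-means minimizes.
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wcss

# Hypothetical two-feature member profiles (e.g. activity, transactions).
profiles = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels, wcss = kmeans(profiles, k=2)
print(labels, round(wcss, 3))
```

The clusters come out as unlabeled indices; mapping them to names such as "Buyer at risk" would be a separate interpretation step.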
  7. Algorithms: Logistic regression
     If the log-odds of P is viewed as a linear function of an explanatory variable, or of a linear combination of explanatory variables, then the logistic regression function can be written as
       $$P = \frac{1}{1 + e^{-(\beta_0 + \beta_1 \alpha_1 + \dots + \beta_n \alpha_n)}}$$
     where α1…αn are the parameters influencing churn and β0…βn are the fitted coefficients.
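As a rough illustration of the logistic regression step on slide 7, the sketch below fits the coefficients by batch gradient descent on the log-loss. The training method, learning rate, function names and the single "activity" feature are assumptions; the slides do not say how the model was fitted.

```python
import math

def predict_proba(coefs, row):
    """P(churn) for one member: logistic function of a linear combination."""
    z = coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, targets, lr=0.1, epochs=5000):
    """Batch gradient descent on the log-loss.

    Returns [intercept, coef_1, ..., coef_n].
    """
    n_features = len(rows[0])
    coefs = [0.0] * (n_features + 1)
    for _ in range(epochs):
        grads = [0.0] * (n_features + 1)
        for row, target in zip(rows, targets):
            err = predict_proba(coefs, row) - target
            grads[0] += err
            for j, x in enumerate(row):
                grads[j + 1] += err * x
        coefs = [c - lr * g / len(rows) for c, g in zip(coefs, grads)]
    return coefs

# Hypothetical data: one scaled activity feature; 1 = cancelled, 0 = current.
activity = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
cancelled = [1, 1, 1, 0, 0, 0]
coefs = fit_logistic(activity, cancelled)
print(coefs)
```

The sign of each fitted coefficient indicates whether a parameter pushes a member toward or away from churn, which is the kind of reading the sales team would use.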
  8. Algorithms: Neural net
     Given a specific task, here assigning a user to one of 5 groups, learning means using a set of factors to find f* ∈ F that solves the task in an optimal sense. Our training data consists of N dealers from each of the 5 groups.
     • Inputs: x1: activity, x2: number of messages, x3: response time, …, xn: etc.
     • Each input xi is multiplied by a weight wi, and the weighted sum Σ wi xi produces the output
     Our cost function is the mean squared error: training tries to minimize the average squared error between the network's output and the target value.
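The weighted-sum diagram and the mean squared error cost on slide 8 can be sketched as a single layer of linear output units, one per group, trained by gradient descent. This is a deliberate simplification (no hidden layer, no activation function), and the function names, learning rate and toy samples are assumptions rather than the network used in the talk.

```python
def forward(weights, x):
    """One linear output per group: output_g = sum_j weights[g][j] * x[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def mse(weights, samples):
    """Mean squared error between network outputs and one-hot targets."""
    total, count = 0.0, 0
    for x, target in samples:
        out = forward(weights, x)
        total += sum((o - t) ** 2 for o, t in zip(out, target))
        count += len(target)
    return total / count

def gradient_step(weights, samples, lr=0.05):
    """One batch gradient-descent step on the MSE cost."""
    count = sum(len(t) for _, t in samples)
    grads = [[0.0] * len(weights[0]) for _ in weights]
    for x, target in samples:
        out = forward(weights, x)
        for g, (o, t) in enumerate(zip(out, target)):
            for j, xi in enumerate(x):
                grads[g][j] += 2.0 * (o - t) * xi / count
    return [[w - lr * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weights, grads)]

def train(samples, n_groups, steps=2000):
    """Start from zero weights and repeat the gradient step."""
    weights = [[0.0] * len(samples[0][0]) for _ in range(n_groups)]
    for _ in range(steps):
        weights = gradient_step(weights, samples)
    return weights

# Hypothetical samples: [bias, activity, messages] with one-hot group targets.
samples = [([1.0, 0.9, 0.8], [1.0, 0.0]), ([1.0, 0.8, 0.9], [1.0, 0.0]),
           ([1.0, 0.1, 0.2], [0.0, 1.0]), ([1.0, 0.2, 0.1], [0.0, 1.0])]
weights = train(samples, n_groups=2)
print(mse(weights, samples))
```

A dealer is assigned to the group whose output unit gives the largest value.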
  9. Algorithms: Sentiment analysis
     Sentiment analysis refers to the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information in source materials. We used a Naïve Bayes model.
     • We have two training groups G = {"Cancel", "Member"}; D = messages
     • Example: tk = {"like", "love", "hate", "bad", "worst", "interesting-to-me" : "not-interesting-to-me", … k terms}
     • The goal is to find the best group for a message D using the maximum a posteriori (MAP) group Gmap
     • tk is a term; Dm is the set of messages from "Member"; Dmk is the subset of Dm that contains tk; Dc is the set of messages from "Cancelled Member"; Dck is the subset of Dc that contains tk
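A minimal sketch of the MAP classification described on slide 9 follows. It uses a multinomial Naive Bayes model with add-one smoothing over whitespace tokens, which is an assumption: the slide's definitions (Dmk, Dck) suggest document counts per term, and the exact estimator is not preserved. The training messages are invented for illustration.

```python
import math
from collections import Counter

def train_nb(messages):
    """messages: list of (text, group). Returns (log_prior, log_like, vocab)."""
    group_counts = Counter(group for _, group in messages)
    term_counts = {group: Counter() for group in group_counts}
    for text, group in messages:
        term_counts[group].update(text.lower().split())
    vocab = set().union(*term_counts.values())
    log_prior = {g: math.log(n / len(messages))
                 for g, n in group_counts.items()}
    log_like = {}
    for g, counts in term_counts.items():
        total = sum(counts.values()) + len(vocab)  # add-one smoothing
        log_like[g] = {t: math.log((counts[t] + 1) / total) for t in vocab}
    return log_prior, log_like, vocab

def classify(model, text):
    """Return the maximum a posteriori group for a message."""
    log_prior, log_like, vocab = model
    scores = {}
    for g in log_prior:
        terms = [t for t in text.lower().split() if t in vocab]
        scores[g] = log_prior[g] + sum(log_like[g][t] for t in terms)
    return max(scores, key=scores.get)

# Hypothetical training messages for the two groups.
training = [("love the auction", "Member"), ("like the service", "Member"),
            ("hate the fees", "Cancel"), ("worst service bad", "Cancel")]
model = train_nb(training)
print(classify(model, "love like"), classify(model, "bad worst"))
```

Working in log space avoids underflow when a message contains many terms; terms never seen in training are ignored.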
  10. The result
     • Every dealer is assigned to a group
     • He/she has 3 different health scores (1 − churn rate)
       – 0-30 days health score (calculated using the last 30 days of data)
       – 30-60 days health score (calculated using the last 30-60 days of data)
       – 60+ days health score (calculated using the last 60-120 days of data)
     • The sales force is alerted when a successful user falls into the risk category. They look into the parameter that put the user in the risk category
       – Example: less activity in the last 30 days
     • The marketing team takes risk-category users and aims promotion schemes at them
  11. ML use case 2: Recommendation
     What is a recommendation system? Recommender systems are a subclass of information filtering systems that seek to predict the 'rating' or 'preference' that a user would give to an item.
     Goal: Suggest relevant content to the users
  12. Recommendation: The approach
     • Step 1
       – Segment customers according to their transaction patterns
     • Step 2
       – For every segment, create a user profile per customer
     • Step 3
       – Match the user profile with the vehicle profile and arrive at a matching score
     • Step 4
       – Rank the relevant content
     • Step 5
       – Combine profile matching and ranking and provide recommendations
  13. The approach: Segment the customers
     Segment the customers according to their behavior
     • Franchise dealer, Independent, Wholesaler
     K-means or any other clustering technique could be used for this purpose. Our objective is to find the best group for every dealer:
       $$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
     where μi is the mean of the points in Si, and S = {different customer segments}.
  14. The approach: Creating user profiles and matching
     • Create user profiles by collecting the dealer transaction pattern over a period of time
     • For every user profile, perform vehicle filtering using content-based collaborative filtering
       – User-item collaborative filtering: relevant content recommendation
         • Customers who bought car X also bought car Y
         • 2010 Honda Accord vs. 2010 Toyota Camry
       – User-user collaborative filtering: "You may also like these"
         • Dealer A and Dealer B: how closely their profiles match
     • A similarity or co-rating matrix is used to arrive at relevant content matching correlations
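The "customers who bought car X also bought car Y" idea on slide 14 can be sketched with an item-item cosine similarity matrix built from purchase counts. The choice of cosine similarity, the function names and the purchase data are assumptions for illustration; the slides only say that a similarity or co-rating matrix is used.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def item_similarity(purchases, items):
    """purchases: {dealer: {item: count}}. Returns {(item_a, item_b): sim}."""
    dealers = sorted(purchases)
    vectors = {i: [purchases[d].get(i, 0) for d in dealers] for i in items}
    return {(a, b): cosine(vectors[a], vectors[b])
            for a in items for b in items if a != b}

def recommend(purchases, items, dealer, top_n=2):
    """Rank items the dealer has not bought by similarity to what they bought."""
    sim = item_similarity(purchases, items)
    owned = purchases[dealer]
    scores = {cand: sum(sim[(cand, item)] * n for item, n in owned.items())
              for cand in items if cand not in owned}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical purchase counts per dealer.
items = ["2010 Honda Accord", "2010 Toyota Camry", "2010 Ford F-150"]
purchases = {
    "Dealer A": {"2010 Honda Accord": 2, "2010 Toyota Camry": 1},
    "Dealer B": {"2010 Honda Accord": 1, "2010 Toyota Camry": 2},
    "Dealer C": {"2010 Ford F-150": 3},
    "Dealer D": {"2010 Honda Accord": 1},
}
print(recommend(purchases, items, "Dealer D"))
```

User-user filtering works the same way with the roles swapped: compare dealer vectors instead of item vectors.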
  15. The approach: Ranking scores using regression
     Customer need score: once we have filtered the profiles that are relevant to the users, rank/sort the vehicles according to some goal so that the most relevant content appears on top.
     • Example: suggest items that make more profit for the customers in the retail market; in this case the regression goal is profit
     • The regression inputs α1…αn can be the buying price from auction, the retail selling price, the detailing work done on the cars, etc.
     Result: Suggest relevant cars to the dealers when they log in to the site
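To make the ranking step on slide 15 concrete, here is a small sketch that scores candidate vehicles with a linear regression model of profit and sorts them. The linear form, the coefficient values and the vehicle figures are all invented for illustration; the slides do not preserve the regression equation or any fitted values.

```python
def predicted_profit(coefs, features):
    """Linear model: coefs[0] + sum of coefs[j + 1] * features[j]."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], features))

def rank_vehicles(coefs, candidates):
    """Sort (name, features) pairs by predicted profit, highest first."""
    return sorted(candidates,
                  key=lambda cand: predicted_profit(coefs, cand[1]),
                  reverse=True)

# Hypothetical coefficients for [auction price, retail price, detailing cost].
coefs = [0.0, -1.0, 1.0, -1.0]
candidates = [
    ("2010 Honda Accord", [9000, 11000, 300]),
    ("2010 Toyota Camry", [9500, 11200, 200]),
    ("2010 Honda Civic", [7000, 9200, 150]),
]
for name, features in rank_vehicles(coefs, candidates):
    print(name, predicted_profit(coefs, features))
```

In practice the coefficients would come from a regression fitted on past transactions, and the ranking would be applied only to vehicles that already passed the profile-matching filter.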
  16. ML use case 3: Forecasting
     • How many transactions is a buyer going to make in the next few weeks?
       – Given the past year's transaction history for a buyer, how many cars will the dealer buy in the next few auctions or online?
       – Which year, make and model will the dealer buy?
       – In which auction and region will he/she buy?
     • How many users are going to churn in the next few months?
       – How many will move from the risk category to the successful category?
       – How many will move to the risk category?
       – How many non-active users will move to the active category?
  17. Synopsis: Time series and ARIMA
     A time series can be viewed as a combination of signal and noise. The signal could follow different patterns, and it could also have a seasonal component:
     • Mean reversion
       – The trend will tend to move to the mean over time
     • Sinusoidal oscillation
     • Etc.
     An ARIMA model can be viewed as a "filter" that tries to separate the signal from the noise; the signal is then extrapolated into the future to obtain forecasts. ARIMA models are, in theory, the most general class of models for forecasting a time series that can be made stationary by differencing.
  18. The approach: ARIMA
     We use an autoregressive integrated moving average (ARIMA) model to calculate the forecast.
     A non-seasonal ARIMA model is classified as an ARIMA(p,d,q) model, where:
       – p is the number of autoregressive terms
       – d is the number of non-seasonal differences needed for stationarity
       – q is the number of moving average terms
     A seasonal ARIMA model is classified as an ARIMA(p,d,q)x(P,D,Q) model, where:
       – P is the number of seasonal autoregressive (SAR) terms
       – D is the number of seasonal differences
       – Q is the number of seasonal moving average (SMA) terms
     According to the signal type, we developed an automatic forecast parameter prediction algorithm that tries different p, P, d, D, q and Q values and selects the combination with the lowest RMSE using the 80-20 rule.
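The parameter search on slide 18 can be sketched as a holdout grid search. To stay dependency-free, the sketch below covers only a restricted subset, ARIMA(p, d, 0) with p and d in {0, 1}, fitted by least squares; it has no moving average or seasonal terms, and reading the 80-20 rule as "fit on the first 80% of the series, score on the last 20%" is an assumption. A full search over (p,d,q)x(P,D,Q) would normally rely on a statistics library. All function names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def difference(series, d):
    """Apply first differencing d times."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

def fit_ar(series, p):
    """Least-squares AR(p) for p in {0, 1}. Returns (intercept, phi)."""
    if p == 0:
        return mean(series), 0.0
    x, y = series[:-1], series[1:]
    mx, my = mean(x), mean(y)
    var = sum((a - mx) ** 2 for a in x)
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / var if var else 0.0
    return my - phi * mx, phi

def forecast(train, p, d, steps):
    """Forecast `steps` values from an ARIMA(p, d, 0) fit, p and d in {0, 1}."""
    diffed = difference(train, d)
    intercept, phi = fit_ar(diffed, p)
    last_diff, last_level = diffed[-1], train[-1]
    out = []
    for _ in range(steps):
        last_diff = intercept + phi * last_diff
        if d == 1:
            last_level += last_diff  # undo the differencing
            out.append(last_level)
        else:
            out.append(last_diff)
    return out

def rmse(actual, predicted):
    return math.sqrt(mean([(a - f) ** 2 for a, f in zip(actual, predicted)]))

def select_order(series, orders=((0, 0), (1, 0), (0, 1), (1, 1))):
    """Fit on the first 80%, score on the last 20%, keep the lowest RMSE."""
    split = int(len(series) * 0.8)
    train, test = series[:split], series[split:]
    scored = [(rmse(test, forecast(train, p, d, len(test))), (p, d))
              for p, d in orders]
    return min(scored)

# Hypothetical weekly counts with a steady upward trend.
weekly_counts = [10 + 2 * t for t in range(50)]
print(select_order(weekly_counts))
```

The winning order is then refitted on the full series to produce the actual forecast.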
  19. One example. [Figure: two plots of count (40,000 to 90,000) against weeks. The first, titled "period − Example4 − c(0, 0, 0), S(1,0,0)", spans weeks 0 to 100; the second, titled "80/20", spans weeks 0 to 60.]
  20. Summary
     • We used various ML techniques and implemented them for vehicle auction industry use cases
     • The choice of algorithm determines the success of the results, and depending on the use case, various algorithms can be used
     • Extracting, cleaning and normalizing the data forms the crucial layer in determining the success of a use case
  21. Acknowledgement
     • Dr. Stephane Pinel
     • Sonar Team
     • Manheim
  22. Q&A
