SALES FORCE ALIGNMENT
Ricky Bilakhia
Instructor – Prof. Khasha Dehnad
Evaluation
[Figures: Plot of Locations; Plot of Single Linkage; Plot of Average Linkage; Plot of Complete Linkage]
•Clustering is preferred because it is an unsupervised method: the data does not contain a target variable.
•All the locations are input to the clustering algorithm, which builds clusters of locations with minimum travel time between them. The absence of a target variable rules out the K-Nearest Neighbor algorithm.
Hierarchical clustering is preferred over k-means:
•Centroids are not applicable to the locations data, since the aim is to form clusters based on average travel time.
•K-Means was tried with random points as initial centroids, and the clusters changed in every iteration. This is another reason for not preferring this algorithm.
Shortest Path within the territory
After the balanced sales territories are formed, travel paths between the locations in each territory are needed.
The algorithm takes the source location together with the other locations that need to be covered. It then compares the time taken to travel from the source to each of the other locations. This process continues until all the locations are ordered by travel time.
Functions such as geom_leg() and qmap() are used to draw the travel path.
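The ordering step can be read as a greedy nearest-neighbour pass over the travel-time matrix. The sketch below follows that reading, which is an assumption: the travel times, the location names and the helper order_visits() are hypothetical and do not come from the poster. A greedy pass gives a reasonable visiting order but is not guaranteed to give the shortest overall route.

```r
# Hypothetical travel-time matrix in minutes (rows = origin, columns = destination).
stops <- c("Source", "P1", "P2", "P3")
time_matrix <- matrix(
  c( 0, 12,  5, 20,
    12,  0,  9,  7,
     5,  9,  0, 15,
    20,  7, 15,  0),
  nrow = 4, byrow = TRUE,
  dimnames = list(stops, stops)
)

# Hypothetical helper: repeatedly move to the closest unvisited location.
order_visits <- function(time_matrix, source) {
  route <- source
  remaining <- setdiff(rownames(time_matrix), source)
  current <- source
  while (length(remaining) > 0) {
    # Compare travel times from the current stop to every unvisited location.
    times <- time_matrix[current, remaining]
    nearest <- remaining[which.min(times)]
    route <- c(route, nearest)
    remaining <- setdiff(remaining, nearest)
    current <- nearest
  }
  route
}

route <- order_visits(time_matrix, "Source")
print(route)
```

The resulting order could then be passed to a plotting step built on qmap() and geom_leg().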
Recommendation
As seen in the Statistical Model section, hierarchical clustering (H-cluster) does not produce balanced sales territories with single or complete linkage.
We therefore propose an advanced K-Means approach, which uses H-cluster to form the initial clusters and then K-Means to form balanced sales territories.
[Figures: Initial Division of Territory; Final Division of Territory]
Steps we followed to get balanced territories:
•Using H-cluster with single linkage, we divided the territory into k groups (k = 2).
•We then prepared an algorithm based on k-means, which works as follows:
a) Randomly select cluster centers.
b) Calculate the distance between each data point and the cluster centers.
c) Assign each data point to the cluster center nearest to it.
d) Recalculate the cluster centers.
e) Recalculate the distance between each data point and the newly obtained cluster centers.
f) If no data point is reassigned, stop; otherwise repeat from step c.
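A minimal sketch of this two-stage idea in base R, under several assumptions: the coordinates are hypothetical, Euclidean distance on longitude/latitude stands in for travel time, and the starting centers are taken from the single-linkage split (as the Recommendation describes) rather than drawn at random. Plain k-means does not enforce equal cluster sizes by itself, so this shows only the seeding and refinement steps, not the poster's full balancing procedure.

```r
# Hypothetical coordinates for six locations.
coords <- matrix(
  c(-74.03, 40.74,
    -74.02, 40.75,
    -74.04, 40.73,
    -73.95, 40.80,
    -73.94, 40.81,
    -73.96, 40.79),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("loc", 1:6), c("lon", "lat"))
)

k <- 2

# Initial division: single-linkage hierarchical clustering cut into k groups.
initial_groups <- cutree(hclust(dist(coords), method = "single"), k = k)

# Starting centers: mean coordinate of each initial group (a k x 2 matrix).
initial_centers <- apply(coords, 2, function(col) tapply(col, initial_groups, mean))

# Refinement: Lloyd's k-means iterations (steps b to f) started from those centers.
fit <- kmeans(coords, centers = initial_centers, algorithm = "Lloyd", iter.max = 50)
final_groups <- fit$cluster

# Territory sizes after refinement.
print(table(final_groups))
```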
Business Intelligence & Analytics
http://www.stevens.edu/howe/academics/graduate/business-intelligence-analytics
Motivation
Define and align the most effective and balanced sales territories based on
geography and physician prescribing habits, in order to maximize promotion of new
medications.
Pharmaceutical companies spend millions of dollars annually on their sales force in
order to promote recently discovered medications, thereby increasing their company’s
revenue.
The sales force is aligned with geographical territories that are defined to maximize
representative effectiveness and reach, and minimize travel time, while providing
“fair” and approximately equal sales potential for each representative.
Technology
R for developing data mining, statistical and scoring models (e.g. market segmentations and alignments).
Google Maps API for acquiring travel times between locations.
Excel for input and output delivery mechanisms.
Current & Future Work
Make travel plans for the representatives so that they can cover their targets in the shortest time
frame.
Optimize alignment of “sales force” in order to maximize “sales force effectiveness” and minimize
expenses.
Form sales territories based on the physicians’ score/ranking. The score/ranking is modeled on the potential number of written prescriptions, using physicians’ and their patients’ profiles (e.g. physician’s specialty, patient pool and demographics).
Future work includes:
•Developing a “recommender” system for the sales force.
•Overlapping sales forces (multiple sales forces covering the same territory).
•Improving scoring models (e.g. utilization of neural networks).
Statistical Model
The latitude and longitude of every address in the data were found, and the distinct locations were extracted.
Alldata_df -> initial data
ddply() -> get distinct locations
For n locations, n^2 location pairs are formed, and the average travel time between each pair is obtained.
R Package – ggmap
mapdist() -> finds the travel time between locations
Mode – driving / walking / bicycling
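The pairing step can be sketched as follows. This is an illustration under stated assumptions, not the poster's code: the sample data and the helper travel_minutes() are hypothetical, unique() is used as a base-R stand-in for ddply(), and travel time is approximated from great-circle distance at an assumed 30 km/h so that the example runs without a Google Maps API key. With a key, ggmap's mapdist() would supply real travel times instead.

```r
# Hypothetical input: one row per address, with coordinates already geocoded.
alldata_df <- data.frame(
  name = c("A", "B", "B", "C"),
  lat  = c(40.745, 40.737, 40.737, 40.728),
  lon  = c(-74.025, -74.031, -74.031, -74.078),
  stringsAsFactors = FALSE
)

# Keep distinct locations (base-R stand-in for the ddply() step).
locations <- unique(alldata_df)
n <- nrow(locations)

# Hypothetical stand-in for a travel-time lookup:
# haversine distance converted to minutes at an assumed average speed.
travel_minutes <- function(lat1, lon1, lat2, lon2, speed_kmh = 30) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  km <- 2 * 6371 * asin(sqrt(a))
  km / speed_kmh * 60
}

# All n^2 ordered pairs of locations.
pairs <- expand.grid(from = seq_len(n), to = seq_len(n))
pairs$minutes <- travel_minutes(
  locations$lat[pairs$from], locations$lon[pairs$from],
  locations$lat[pairs$to],   locations$lon[pairs$to]
)

# Reshape into an n x n matrix: rows are origins, columns are destinations.
time_matrix <- matrix(pairs$minutes, nrow = n, ncol = n,
                      dimnames = list(locations$name, locations$name))
print(round(time_matrix, 1))
```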
Clustering
Hierarchical Clustering to form sales territories:
•Because the distance matrix is self-defined (it contains the average travel time from one location to another), the as.dist() function is used to pass it to hierarchical clustering.
•The cutree() function divides the tree formed by hclust() by specifying either the number of groups (k) or the height at which the tree is cut (h).
hclust_output – the clustering produced by hclust()
k = 2 – the number of groups into which the tree is split
method -> single/average/complete
distMatrix – distance data matrix
Average Matrix
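A minimal sketch of how these pieces fit together, assuming a small hypothetical travel-time matrix (the values below are made up for illustration). It reuses the names distMatrix and hclust_output from the notes above and prints the cluster sizes for each linkage method, which is one simple way to check how balanced the resulting territories are.

```r
# Hypothetical travel times in minutes between five locations
# (symmetric, zero diagonal).
distMatrix <- matrix(
  c( 0,  5,  6, 40, 42,
     5,  0,  4, 38, 41,
     6,  4,  0, 39, 40,
    40, 38, 39,  0,  7,
    42, 41, 40,  7,  0),
  nrow = 5, byrow = TRUE,
  dimnames = list(LETTERS[1:5], LETTERS[1:5])
)

# as.dist() wraps the precomputed matrix so hclust() uses it directly
# instead of computing Euclidean distances.
travel_dist <- as.dist(distMatrix)

for (method in c("single", "average", "complete")) {
  hclust_output <- hclust(travel_dist, method = method)
  # Cut the tree into k = 2 groups; cutree(hclust_output, h = 20) would cut by height instead.
  groups <- cutree(hclust_output, k = 2)
  cat(method, "linkage sizes:", as.vector(table(groups)), "\n")
}
```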
K-Means:
Google Maps Package – ggmap
geocode – gets the latitude and longitude of a location
get_googlemap – accesses the Google Static Maps API to download a static map
geom_text (from ggplot2) – displays the location labels of the corresponding markers on the map
