Final-semester project on leveraging data analysis for a sales department using descriptive and predictive analytics. The predictive analytics use Neural Network and Logistic Regression models in the R language.
3. • Providing fluidity to the Sales department
• Better ways of visualization and analysis of data
• Increasing the forecasting efficiency
• Generation of MHAs (Must Have Accounts)
5. • Tech Mahindra’s involvement in sales
• Tech Mahindra offers innovative and customer-centric information technology experiences, enabling Enterprises, Associates and the Society to Rise. It is a USD 4.4 billion company spread across 90 countries.
• Tech Mahindra is one of the leaders in the sales environment. It thrives because it recognizes that companies need to move forward with technology. The company believes in constantly adapting, learning, changing and growing.
• Effective use of analytics can do wonders
• McKinsey’s survey of more than 1,000 sales organizations around the world found that 53 percent of those that are “high performing” rate themselves as effective users of analytics. Well-designed analytics programs deliver significant top-line and margin growth by guiding sales teams to better decisions.
• Decisions made using analytics > decisions made using analysis through observation
• In sales, once slow-moving and driven by intuition, data and new analytical techniques have introduced greater rigor, efficiency, and insight. In many industries, it is the adoption of advanced analytics that has begun to differentiate the winners from the rest.
7. Understanding the Business
Data Preparation
Analysis using RStudio
Descriptive & Predictive
Descriptive Analytics
Predictive Analytics
Logistic Regression Model
Neural Network Model
Inferences & Conclusion
• Descriptive analytics presents visualizations of the current scenario
• It describes comprehensively what is, or what the data shows
• It can prove informative when targeting particular verticals
• Predictive analytics considers multiple parameters on which the deal closure (won/lost) of a particular account depends and leverages them to predict whether other accounts with the same parameters will be won or lost.
9. • Data preparation is a process to ensure that information being readied for analysis is accurate and consistent
• Data is often created with missing values, inaccuracies or other errors. Additionally, data sets stored in separate files or databases often have different formats that need to be reconciled
• The process of correcting inaccuracies, performing verification and joining data sets constitutes a big part of the data preparation process.
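The preparation steps above can be sketched in base R. This is only an illustration under stated assumptions: the two data frames, their values and the name `prepared` are hypothetical, and only the column names come from the parameter table of this project.

```r
# Hypothetical extracts stored separately; all values are invented.
accounts <- data.frame(
  AccountID = c("A1", "A2", "A3", "A3"),
  Country   = c("United States", "Hungary ", NA, NA),
  stringsAsFactors = FALSE
)
opportunities <- data.frame(
  AccountID    = c("A1", "A2", "A3"),
  OverallValue = c("1200", "n/a", "560"),
  Outcomes     = c("Won", "Lost", "won"),
  stringsAsFactors = FALSE
)

# Drop verbatim duplicate rows.
accounts <- unique(accounts)

# Reconcile formats: strip stray whitespace and unify the case of labels.
accounts$Country <- trimws(accounts$Country)
opportunities$Outcomes <- tolower(trimws(opportunities$Outcomes))

# Convert revenue to numeric; unparseable entries such as "n/a" become NA.
opportunities$OverallValue <- suppressWarnings(
  as.numeric(opportunities$OverallValue)
)

# Join the two sources on the shared key, keeping every account.
prepared <- merge(accounts, opportunities, by = "AccountID", all.x = TRUE)

# Verification: count the missing values that remain in each column.
missing_per_column <- colSums(is.na(prepared))
print(prepared)
print(missing_per_column)
```

How the remaining missing values are handled (dropping rows, imputing, or flagging them) depends on the dataset and is left open here.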
Parameter            Meaning
AccountIBG           Independent Business Group
AccountIBU           Independent Business Unit
AccountID            ID of the account
AccountName          Name of the opportunity
AccountSBU           Sales Business Unit of the account
City                 City from which the organization operates
Competency           Which task it is qualified to do
Country              Country from which the organization operates
Date                 Date when the account was created
MonthofQtr           Which month of the quarter it is in
OpportunityID        ID assigned to the opportunity
OpportunityOwner     Name of the Opportunity Owner
OpportunityOwnerGID  Reference number of the Opportunity Owner
Outcomes             Account closed or lost
OverallValue         Revenue generated/lost by that account
PrimaryIndustry      Kind of industry the organization is involved in
SecondaryIndustry    Secondary industry involved in
ServiceOffering      Overall division of services
Tenure               Time elapsed since the deal has been active
WeekofQtr            Which week of the quarter it is in
11. 1. Stacked Chart between Number of Accounts Lost/Committed Country Wise
The graph shows that the maximum number of accounts were accomplished in the United States and Bolivia, but the highest win-to-loss ratio is evident in Hungary.
2. Time Series Analysis on Number of Accounts created
The time series analysis paints a very clear picture: in most years, whenever there was a rise in the number of accounts created, it was in the fourth quarter, showing that most accounts were opened in the last quarter of a financial year.
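The numbers behind these two charts can be derived with a few base R calls. The sketch below uses an invented `deals` data frame and assumes a financial year that starts in April; the source does not state which financial year is used, so the quarter formula is an assumption.

```r
# Hypothetical sample; every value is invented for illustration.
deals <- data.frame(
  Country  = c("United States", "United States", "United States",
               "Hungary", "Hungary", "Bolivia", "Bolivia"),
  Outcomes = c("Won", "Lost", "Lost", "Won", "Won", "Won", "Lost"),
  Date     = as.Date(c("2016-05-10", "2016-11-02", "2017-01-15",
                       "2017-02-20", "2017-03-05", "2016-08-19",
                       "2017-03-28"))
)

# Counts per country and outcome: the data behind a stacked bar chart.
counts <- table(deals$Country, deals$Outcomes)

# Win-to-loss ratio per country (Inf where a country has no losses).
win_loss <- counts[, "Won"] / counts[, "Lost"]

# Fiscal quarter, ASSUMING the financial year starts in April.
month      <- as.integer(format(deals$Date, "%m"))
fiscal_qtr <- ((month - 4) %% 12) %/% 3 + 1

# Accounts created per fiscal quarter.
per_quarter <- table(factor(fiscal_qtr, levels = 1:4))

print(counts)
print(win_loss)
print(per_quarter)
```

`barplot(t(counts))` would draw the stacked chart from the same table.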
12. 3. Time Series Analysis on Number of Accounts created
In the year 2016-2017, the creation of accounts was stagnant throughout the first half of the year.
4. Distribution of Accounts according to their primary and secondary industries
13. 5. Relationship between SBU and Revenue Generated & Lost
The SAL EMEA Sales Business Unit generates the highest revenue, and more emphasis needs to be placed on the Sales Cluster, which produces the lowest income.
The Sales Cluster is still doing considerably better than Sales America, where the revenue is 0.75 million but the loss is around 4 million.
6. Relationship between Opportunity Owners and Won/Lost accounts and Generated/Lost revenue respectively
These graphs collectively aid a comparative performance analysis of Opportunity Owners. For instance, even though P Vijaya Raghvan has fewer accounts than Phillip Anthony Armstrong, he generated higher revenue with no revenue lost from his accounts. This reflects his accuracy and also helps identify highly skilled employees who can then be deployed on higher-ranking accounts.
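A comparison of this kind can be tabulated before plotting. The following sketch aggregates generated and lost revenue per Opportunity Owner; the owner labels and amounts are hypothetical, and only the column names follow the parameter table.

```r
# Hypothetical deals; owner labels and amounts are invented.
deals <- data.frame(
  OpportunityOwner = c("Owner A", "Owner A", "Owner A", "Owner B", "Owner B"),
  Outcomes         = c("Won", "Lost", "Won", "Won", "Won"),
  OverallValue     = c(100, 250, 50, 300, 120)
)

# Sum OverallValue per owner and outcome.
revenue <- xtabs(OverallValue ~ OpportunityOwner + Outcomes, data = deals)

# Number of accounts handled by each owner.
n_accounts <- table(deals$OpportunityOwner)

# One row per owner: account count, revenue generated, revenue lost.
owner_summary <- data.frame(
  Owner     = rownames(revenue),
  Accounts  = as.vector(n_accounts),
  Generated = as.vector(revenue[, "Won"]),
  Lost      = as.vector(revenue[, "Lost"])
)
print(owner_summary)
```

In this invented sample, Owner B handles fewer accounts than Owner A but generates more revenue with nothing lost, which is the pattern the slide describes.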
14. 7. How the accounts are spread all around the world
8. 3D plot for Country and the corresponding revenue generated and lost
The 3D scatter plot is interactive: for every country it shows the amount generated and the amount lost, helping us gauge the profit and loss percentages.
16. • Once the model is prepared, we convert the values obtained from the model to probabilities by using built-in functions
• These probabilities help in plotting the ROC (Receiver Operating Characteristic) curve, which summarizes the model’s performance
• The area under the curve (AUC), also referred to as the index of accuracy (A) or concordance index, is a single-number performance metric for the ROC curve. The higher the area under the curve, the better the predictive power of the model
• Logistic regression is used to ascertain the probability of an event, and this event is captured in binary format, i.e. 0 or 1
• The variable titled “outcomes” is treated as the dependent variable and the others as independent variables
• The model was checked for multicollinearity, and the model with the minimum AIC value was selected
• As per this model, AccountSBU, AccountIBG, AccountIBU, PrimaryIndustry, City, Country and OpportunityOwnerGID have the highest significance for predicting whether an account will be lost or won
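The modelling steps above can be sketched with base R. This is a minimal illustration on simulated data, not the project's actual code: the data frame, factor levels and the simulated relationship are assumptions, `step()` is used here as one common way to pick a low-AIC model, and the multicollinearity check is not shown.

```r
set.seed(42)
n <- 400

# Simulated accounts; column names follow the parameter table,
# the values and the relationship to the outcome are invented.
deals <- data.frame(
  AccountSBU = factor(sample(c("SBU1", "SBU2", "SBU3"), n, replace = TRUE)),
  Country    = factor(sample(c("C1", "C2"), n, replace = TRUE)),
  Tenure     = round(runif(n, 1, 36))
)
linear_part <- -0.5 + 1.2 * (deals$AccountSBU == "SBU2") -
  0.8 * (deals$Country == "C2")
deals$Outcomes <- rbinom(n, 1, plogis(linear_part))  # 1 = won, 0 = lost

# Logistic regression: Outcomes is the dependent variable.
full_model <- glm(Outcomes ~ AccountSBU + Country + Tenure,
                  data = deals, family = binomial)

# Stepwise search that keeps a term only if it lowers the AIC.
best_model <- step(full_model, trace = 0)

# Convert the fitted values to probabilities of a win.
win_prob <- predict(best_model, type = "response")

print(formula(best_model))
print(AIC(best_model))
```

`summary(best_model)` lists the coefficient estimates and their p-values, which is where the significance of each parameter would be read off.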
17. • The model is further tested on data containing accounts other than the ones used to prepare the base model
• The ROC curve is obtained by applying the model to this new data
• The Area Under the Curve (AUC) obtained was 0.80, meaning the model ranks a randomly chosen account of the positive class above a randomly chosen account of the other class about 80% of the time
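Testing on held-out accounts and computing the AUC can be sketched as follows. The data is simulated, the predictor `x` and the helper `auc()` are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formula instead of a dedicated ROC package, so no extra library is assumed.

```r
set.seed(7)
n <- 600

# Simulated data with one invented predictor.
x <- rnorm(n)
deals <- data.frame(x = x, Outcomes = rbinom(n, 1, plogis(1.5 * x)))

# Hold out 30% of the accounts for testing.
train_idx <- sample(n, round(0.7 * n))
train <- deals[train_idx, ]
test  <- deals[-train_idx, ]

# Fit on the training accounts, predict probabilities for the new ones.
model     <- glm(Outcomes ~ x, data = train, family = binomial)
test_prob <- predict(model, newdata = test, type = "response")

# AUC as the probability that a positive case is ranked above a negative one.
auc <- function(prob, label) {
  r     <- rank(prob)            # ties receive average ranks
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_value <- auc(test_prob, test$Outcomes)
print(auc_value)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of won and lost accounts.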
19. • A neural network is an information-processing machine and can be viewed as analogous to the human nervous system
• Information is passed through interconnected units, analogous to the passage of information through neurons in humans. The first layer of the neural network receives the raw input, processes it and passes the processed information to the hidden layers. The hidden layers pass the information to the last layer, which produces the output. The advantage of a neural network is that it is adaptive in nature. It learns from the information provided, i.e. it trains itself on data with a known outcome and optimizes its weights for better prediction in situations with an unknown outcome.
• A neural network is handy for inferring meaning and detecting patterns in complex data sets
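The flow of information from the input layer through a hidden layer to the output can be written out in a few lines. This is a toy forward pass with hand-picked weights, assuming sigmoid activations and a single hidden layer; the names `sigmoid`, `forward`, `W1`, `b1`, `W2` and `b2` are all hypothetical, and training (the weight optimization) is not shown.

```r
# Squashes any real number into the interval (0, 1).
sigmoid <- function(z) 1 / (1 + exp(-z))

# One forward pass: input -> hidden layer -> single output unit.
forward <- function(x, W1, b1, W2, b2) {
  hidden <- sigmoid(W1 %*% x + b1)       # hidden-layer activations
  output <- sigmoid(W2 %*% hidden + b2)  # output-layer activation
  as.numeric(output)
}

# Hand-picked weights: 3 inputs, 2 hidden units, 1 output.
W1 <- matrix(c(0.5, -0.3, 0.8, 0.1, -0.6, 0.4), nrow = 2)
b1 <- c(0, 0.1)
W2 <- matrix(c(1.0, -1.0), nrow = 1)
b2 <- 0

# Score for one input vector, readable as a probability-like value.
score <- forward(c(1, 0, 1), W1, b1, W2, b2)
print(score)
```

Training adjusts `W1`, `b1`, `W2` and `b2` so that the scores for accounts with known outcomes move toward those outcomes.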
20. • The network model is built on multiple parameters: Country, Primary Industry, Secondary Industry, Account SBU, Account IBG and Opportunity Owner
• This is the model used to identify the Must Have Accounts (MHA). Country is the most influential factor in the analysis of Must Have Accounts, followed by Primary Industry, Secondary Industry and others
• The accuracy on this sample is very close to 85%. The accuracy changes with the train and test samples because the outcome occasionally depends on the account owner
• The RMSE value observed is 0
• The predicted and observed results are very similar
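Training a network of this kind and measuring accuracy and RMSE on a test sample could look roughly like the sketch below. It assumes the `nnet` package, which ships with standard R distributions; the source does not say which package was used. The data is simulated, and the hidden-layer size, decay and split ratio are arbitrary choices.

```r
library(nnet)
set.seed(123)
n <- 500

# Simulated accounts; levels and the link to the outcome are invented.
deals <- data.frame(
  Country         = factor(sample(c("C1", "C2", "C3"), n, replace = TRUE)),
  PrimaryIndustry = factor(sample(c("I1", "I2"), n, replace = TRUE))
)
p_win <- plogis(-1 + 2 * (deals$Country == "C1") +
                  1 * (deals$PrimaryIndustry == "I2"))
deals$Outcomes <- factor(ifelse(rbinom(n, 1, p_win) == 1, "Won", "Lost"))

# Split into a train sample and a test sample.
train_idx <- sample(n, round(0.7 * n))
train <- deals[train_idx, ]
test  <- deals[-train_idx, ]

# Single-hidden-layer network with 3 hidden units.
fit <- nnet(Outcomes ~ Country + PrimaryIndustry, data = train,
            size = 3, decay = 0.01, maxit = 300, trace = FALSE)

# Accuracy: share of test accounts whose predicted class matches the outcome.
pred_class <- predict(fit, newdata = test, type = "class")
accuracy   <- mean(pred_class == test$Outcomes)

# RMSE between the predicted win probability and the observed 0/1 outcome.
pred_prob <- as.numeric(predict(fit, newdata = test, type = "raw"))
rmse      <- sqrt(mean((pred_prob - (test$Outcomes == "Won"))^2))

print(accuracy)
print(rmse)
```

Re-running with a different seed changes the split and therefore the accuracy, which is the sample dependence noted on the slide.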
22. • We have been able to formulate reusable code which can be deployed on a variety of datasets. When the parameters are fulfilled, i.e. the data has all of them, the code can be executed and will provide all the aspects mentioned in Result Analysis. The goal of the project, to give the team more information about the current sales scenario across multiple verticals, was reached
• Descriptive Analysis – It exhibited multiple relationships between the parameters of the dataset. The analytics present visualizations of the company's current scenario and show where it needs to move forward on the verticals in industry acquisition, countries to be targeted, and account segregation at product level
• Predictive Analysis – The two models focused on a more forward-looking analysis, which in turn can aid current scheduling. The task achieved here was the generation of MHAs, i.e. the MUST HAVE ACCOUNTS. The models consider multiple parameters involved in the final decision on an account's survival and then leverage them to predict the outcome of future accounts
• The significance of the project is to create new opportunities in terms of accounts, to improve the skills of Opportunity Owners by making them more aware, and to continue the advancement of the organization in the market