Predictive analytics uses past data to forecast future outcomes. The document discusses various predictive analytics techniques including simple forecasting methods, decision trees, and regression. Simple forecasting techniques like moving averages are easiest to implement but lack explanatory power, while decision trees and regression provide more accurate predictions at an individual level but require more complex deployment. The key is selecting the right technique based on the problem, data, and ability to implement predictive models in real-world applications.
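As a concrete illustration of the simplest technique mentioned (not drawn from any of the decks below), a trailing moving average forecasts the next value as the mean of the most recent observations. The sales figures here are invented for illustration:

```python
# Hypothetical sketch: forecast the next period as the mean of the
# last `window` observations (a trailing moving average).
def moving_average_forecast(history, window=3):
    recent = history[-window:]
    return sum(recent) / len(recent)

sales = [100, 104, 98, 110, 107, 111]
forecast = moving_average_forecast(sales, window=3)
print(forecast)  # mean of the last three observations: 110, 107, 111
```

As the overview notes, this is easy to implement but offers no explanation of *why* sales move; tree- and regression-based models trade that simplicity for individual-level predictions.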
Detailed insight into the analytical steps required for generating reliable insights from analysis - univariate, bivariate and multivariate analysis, OLS & logistic models, etc.
Put together to train friends and mentees. Based on personal learnings/research, with no proprietary information and no claims of 100% accuracy. Every institution/organization/team uses its own steps/methodologies, so please use whichever is relevant for you; this is for training purposes only.
A Review on Credit Card Default Modelling using Data Science - Yogesh, IJTSRD
In the last few years, credit cards have become one of the major consumer lending products in the U.S. as well as several other developed nations, representing roughly 30% of total consumer lending (USD 3.6 tn in 2016). Credit cards issued by banks hold the majority of the market share, with approximately 70% of the total outstanding balance. Banks' credit card charge-offs have stabilized after the financial crisis at around 3% of the total outstanding balance. However, there are still differences in charge-off levels between competitors. Harsh Nautiyal | Ayush Jyala | Dishank Bhandari, "A Review on Credit Card Default Modelling using Data Science", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | International Conference on Advances in Engineering, Science and Technology - 2021, May 2021. URL: https://www.ijtsrd.com/papers/ijtsrd42461.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/42461/a-review-on-credit-card-default-modelling-using-data-science/harsh-nautiyal
Default Probability Prediction using Artificial Neural Networks in R Programming - Vineet Ojha
The objective of the project is to analyze the ability of the developed Artificial Neural Network model to forecast the credit risk profile of retail banking loan consumers and credit card customers.
From a theoretical point of view, this project presents a literature review on the detailed working and application of Artificial Neural Networks for credit risk management.
Practically, the aim of this project is to present a model for estimating the Probability of Default using an Artificial Neural Network, to accrue the benefits of non-linear models.
Machine Learning Project - Default of Credit Card Clients - Vatsal N Shah
- The model built here uses all available factors in the customer data to predict who will be a defaulter and who will not next month.
- The goal is to find whether clients will be able to pay their next month's credit amount.
- Identify potential customers for the bank who can settle their credit balance.
- Determine whether customers can make their credit card payments on time.
- Default is the failure to pay interest or principal on a loan or credit card payment.
This project aims at predicting defaulters on credit card payments. R is used for exploratory data analysis; R and Azure ML are used for model building.
Predicting Credit Card Defaults using Machine Learning Algorithms - Sagar Tupkar
This is a project I worked on as a capstone for my Masters in Business Analytics program at the University of Cincinnati. In this project, I performed an end-to-end data mining exercise - including data cleaning, distribution analysis, exploratory data analysis and model building - to identify and predict credit card defaults using customers' data on past payments and general profile. In the process of building machine learning models, I fit and compared the performance of multiple models and algorithms, including Logistic Regression, PCA, classification trees, AdaBoost classifiers, ANN and LDA.
• Forecasted the Expected Credit Loss over the lifetime of the mortgage. Built a loan-level PD model using a Markov chain transition matrix and logistic regression with six transition states, and validated it using backtesting.
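The transition-matrix idea above can be sketched in miniature. The states and probabilities below are invented for illustration (the actual model used six states); the mechanics - repeatedly multiplying a one-period transition matrix to obtain a multi-period default probability - are the same:

```python
# Hypothetical 3-state sketch: Current, Delinquent, Default (absorbing).
# Multiplying the one-period transition matrix by itself n times gives
# the n-period transition probabilities; the (current -> default) entry
# is the cumulative probability of default within n periods.
STATES = ["current", "delinquent", "default"]
P = [
    [0.95, 0.04, 0.01],   # from current
    [0.40, 0.45, 0.15],   # from delinquent
    [0.00, 0.00, 1.00],   # default is absorbing
]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lifetime_pd(p, periods):
    """Probability a currently-performing loan defaults within `periods`."""
    pn = p
    for _ in range(periods - 1):
        pn = mat_mul(pn, p)
    return pn[0][2]  # row: current, column: default

print(round(lifetime_pd(P, 12), 4))  # cumulative 12-period PD
```

Because the default state is absorbing, the cumulative PD can only grow with the horizon, which is what makes the matrix-power construction suitable for lifetime loss forecasting.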
Reduction in Customer Complaints - Mortgage Industry - Pranov Mishra
The project aims at analysis of customer complaints/inquiries received by a US-based mortgage (loan) servicing company.
The goal of the project is to build a predictive model using the identified significant contributors and come up with recommendations for changes that will lead to:
1. Reduced re-work
2. Reduced operational cost
3. Improved customer satisfaction
4. Improved company preparedness to respond to customers
Three models were built - Logistic Regression, Random Forest and Gradient Boosting. Accuracy, AUC (area under the curve), sensitivity and specificity all improved markedly as model complexity increased from simple to complex.
Logistic regression was not generalizing well to the non-linear data, so the model suffered from both bias and variance. Random Forest is an ensemble technique in itself and helps reduce variance to a great extent, while Gradient Boosting, with its sequential learning, helps reduce bias. The results from random forest and gradient boosting did not differ by much. This confirms the bias-variance trade-off: flexible, complex models do well on non-linear data, whereas inflexible simple models have high bias (and can still have high variance).
Additionally, a lift chart was built, which gives a cumulative lift of 133% in the first four deciles.
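How a cumulative lift figure like the one above is computed can be sketched on synthetic scores and outcomes (none of these numbers come from the project): rank customers by predicted probability, then compare the response rate captured in the top deciles with the overall response rate.

```python
# Synthetic sketch of cumulative lift: a lift of 1.33 in the top deciles
# corresponds to the "133%" phrasing used in lift charts.
def cumulative_lift(scores, outcomes, top_fraction):
    ranked = sorted(zip(scores, outcomes), key=lambda t: -t[0])
    k = int(len(ranked) * top_fraction)
    captured_rate = sum(y for _, y in ranked[:k]) / k
    overall_rate = sum(outcomes) / len(outcomes)
    return captured_rate / overall_rate

scores   = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
outcomes = [1,   1,   0,   1,   0,   0,   1,   0,   0,   0]
print(cumulative_lift(scores, outcomes, top_fraction=0.4))
```

A lift above 1.0 in the early deciles is what justifies targeting only the highest-scored customers with the complaint-reduction recommendations.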
Sales Performance Deep Dive and Forecast: An ML-Driven Analytics Solution - Pranov Mishra
Problem Statement
One of Unilever's brands is going through a steep decline in revenues, requiring major changes in business execution plans. Management expects a thorough analysis of historical performance, culminating in identification of the key factors driving sales.
Data Summary and Product Life Cycle Overview
The data provided constituted more than 30 years of sales and related variables.
The training data suggested that the product has gone through a life-cycle of launch, growth and maturity. There were indications of a decline phase in the last few periods of training data.
The test data corroborated these indications: a sharp decline (more than 25%) is visible since 2016.
Key Insights & Driver Analysis
The factors with a significant positive impact on sales volumes were identified as promotion expenditure, volumes produced or in stock, inflation, rainfall and visibility through social search impressions.
The factors with a significant negative impact on sales volumes were identified as brand equity, competitor prices, fuel price and digital impressions.
Forecasting
Multiple approaches were attempted, including ARIMA, Holt-Winters double exponential smoothing, a Bayesian structural time series approach (BSTS) and LSTM.
The best results were achieved when the training data was combined with two years of test data to capture the decline phase: a MAPE of 25% was achieved with Holt-Winters, followed by ARIMA with a MAPE of 33%.
For the second problem statement, which required training on the test data only, the best results were achieved with the BSTS model followed by LSTM, with MAPEs of 5% and 13% respectively.
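MAPE, the accuracy metric quoted above, is simply the mean of the absolute percentage errors between actuals and forecasts. A minimal sketch on illustrative numbers (not from the project):

```python
# MAPE: average of |actual - forecast| / |actual|, expressed as a percent.
def mape(actual, forecast):
    return 100 * sum(abs(a - f) / abs(a)
                     for a, f in zip(actual, forecast)) / len(actual)

actual   = [100, 120, 90, 110]
forecast = [110, 115, 95, 100]
print(round(mape(actual, forecast), 2))  # average percentage error
```

Note that MAPE is undefined when an actual value is zero and penalizes over- and under-forecasts asymmetrically, which is worth keeping in mind when comparing models by this metric alone.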
Prediction of Customer Propensity to Churn - Telecom Industry - Pranov Mishra
The aim of this project is to help a telecom company with insights on customer behavior that would be useful for customer retention. The specific goals are given below:
1. Identification of the top variables driving likelihood of churn
2. Build a predictive model to identify the customers with the highest probability of terminating services with the company.
3. Build a lift chart to optimize effort by targeting most of the potential churners with the least contact effort. Here, with 30% of the total customer pool, the model accurately captures 33% of total potential churn candidates.
The models tried to arrive at the best were:
1. Simple Models like Logistic Regression & Discriminant Analysis with different thresholds for classification
2. Random Forest after balancing the dataset using Synthetic Minority Oversampling Technique (SMOTE)
3. Ensemble of five individual models and predicting the output by averaging the individual output probabilities
4. The XGBoost algorithm
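Point 3 in the list above - averaging the output probabilities of several models - can be sketched as follows. The probabilities and the 0.5 decision threshold are illustrative assumptions, not values from the project:

```python
# Soft-voting ensemble: average each customer's churn probability across
# models, then classify against a threshold.
def ensemble_average(prob_lists):
    """Average per-customer probabilities across models."""
    return [sum(ps) / len(ps) for ps in zip(*prob_lists)]

model_probs = [
    [0.80, 0.10, 0.55],  # e.g. logistic regression
    [0.70, 0.20, 0.60],  # e.g. random forest
    [0.90, 0.05, 0.50],  # e.g. boosted trees
]
avg = ensemble_average(model_probs)
churn_flags = [p >= 0.5 for p in avg]
print(avg, churn_flags)
```

Averaging probabilities (soft voting) tends to be more stable than majority voting on hard labels, since it preserves each model's confidence.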
Credit card default is an important issue that brings negative consequences to both sides, i.e., banks and customers. If a customer does not pay his obligations, the bank loses money, the customer loses credibility for future payments, collection calls start to be made and, as a last resort, the case may go to court. To avoid all of that trouble, effective methods that can predict credit card default are needed. Credit card default prediction is therefore an important, challenging and useful task that should be addressed.
This presentation documents how the problem can be addressed, following the pipeline of a typical Pattern Recognition application. The main task is to classify a set of samples - representing the history of payments and bill statements of a given client, plus some background information about the client - according to the client's ability to pay (or default on) the next monthly payment of their credit card.
This slide deck discusses predictive data analytics models and their applications in a broader context. It gives simple examples of regression and classification.
3rd Alex Marketing Club (Pharmaceutical Forecasting), Dr. Ahmed Sham'a - Mahmoud Bahgat
Data reduction: breaking down large sets of data into more-manageable groups or segments that provide better insight.
- Data sampling
- Data cleaning
- Data transformation
- Data segmentation
- Dimension reduction
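A minimal sketch of the segmentation step above - collapsing row-level records into per-segment summaries, a simple form of data reduction. The segment labels and spend figures are invented for illustration:

```python
# Group raw rows by segment and keep only a per-segment average,
# reducing many records to a few interpretable numbers.
from collections import defaultdict

rows = [
    {"segment": "young",  "spend": 120},
    {"segment": "young",  "spend": 80},
    {"segment": "senior", "spend": 200},
    {"segment": "senior", "spend": 180},
]

by_segment = defaultdict(list)
for row in rows:
    by_segment[row["segment"]].append(row["spend"])

summary = {seg: sum(v) / len(v) for seg, v in by_segment.items()}
print(summary)  # average spend per segment instead of raw rows
```

The same grouping pattern underlies sampling and dimension reduction: each trades raw volume for a smaller representation that is easier to reason about.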
Random Forest classification is a machine learning technique that aggregates the outcomes of many decision tree classifiers to improve the precision of the result. It models the relationship between a categorical target variable and one or more independent variables.
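The aggregation idea behind Random Forest can be shown in miniature. The tree votes below are hard-coded for illustration; a real forest trains each tree on a bootstrap sample with a random subset of features, then takes the majority vote:

```python
# Majority voting over per-tree class predictions, the final step of
# Random Forest classification. (Tree outputs are hard-coded here.)
from collections import Counter

def majority_vote(tree_predictions):
    return Counter(tree_predictions).most_common(1)[0][0]

votes = ["default", "no_default", "default", "default", "no_default"]
print(majority_vote(votes))  # "default" wins 3 votes to 2
```

Because individual trees overfit in different ways, averaging their votes reduces variance, which is the precision improvement the paragraph above refers to.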
Market Research using SPSS - Edu4Sure, Sept 2023 - Edu4Sure
SPSS training-related content, with practical training on the tool. The PPT is for reference purposes.
This report contains:
1. What data analytics is, its uses and its types
2. Tools used for data analytics
3. A description of classification
4. A description of association
5. A description of clustering
6. Decision trees, SVM modelling, etc., with examples
Risk Product Management - Creating Safe Digital Experiences, Product School 2019 - Ramkumar Ravichandran
Sreekant Vijayakumar and I spoke at Product School in Dec 2019 on everything that goes into risk management at digital enterprises. The first part focused on explaining why risk management is an existential question for organizations today, not a cost saving. The second part focuses on the foundations of risk management, and the last part on how a real risk management practice (product managers, data scientists, engineers, operations) is built and run in an organization.
Artificial Intelligence is here to stay and will drastically improve our lives. However, as with any emerging technology, there has been a FOMO rush to get something ("AI-as-a-brand") out, which led to creating AI products first and then looking for customers and problems to solve. Creating products that drive real impact at scale requires loving your customers and their problems instead of loving the product you created. It means commitment, persistence and humility to identify real customer needs, give your everything to meet them, and learn and improve along the way. The "Learn-Listen-Test" framework is perfectly suited to doing this at scale and with effectiveness, marrying together Reporting to monitor KPIs, Analytics to explain the reasons behind things, User Research to contextualize them and Experimentation to pick the best solution. Today's AI product leaders became who they are by going back to the basics and learning their way into becoming an integral part of our lives, and we should emulate them as we think about our own products.
Presented at DCD Mexico 2017. The digital era is characterized by the omnipresence of data and analytics across the value proposition of the organization, whether as a core offering, an add-on, a competitive advantage or optimization support. This has made Analytics a living, breathing organism that grows and changes with time - in the role it plays for the various stakeholders (which itself changes), the forms of delivery, the ownership and, finally, the size of impact. The "Analytics Maturity Curve" provides a guiding vision and framework for analytics programs across the industry. The presentation focuses on the evolution of the "Analytics Maturity Curve" itself over time, the need for it, the challenges and, finally, the lessons learnt during the transition from one phase to another. The success criterion for this presentation is that the audience leaves with a perspective on what differentiates the programs that successfully made the transition, and with a best-practice checklist to refer to in their own journey.
This will be presented at Optimizely's San Francisco User Group session on Oct 4th. As with any program, an A/B testing practice follows a specific maturity curve. Since it is complex and spans various domains and business units, it begins with a "Sell" phase focused on getting buy-in from stakeholders, with a specific focus on Engineering & QA; followed by a "Scale" phase focused on building the team, efficiency and the program; then an "Expand" phase focused on wider-scope, more complex tests and strengthening the platform; over to the "Deepen" phase, where the focus is to ingrain testing within the company's DNA, i.e., within the backend/algorithms, and cross-pollinate learning and testing across business units. The final phase is the "Sustain" phase, where algorithmic test management takes over testing, and testing is productized as a value-add service for monetization and brand capital creation. We will walk the audience through our own journey so far along the maturity curve, the lessons learnt along the way, the challenges and what worked for us. The session will be rounded out with a working session with the audience on their own journeys, lessons and advice for others.
This was presented at a meetup called Data & Analytics (DNA) at Raipur, India, organized by Ashutosh Tripathi of the Krishna Public Schools heritage. The audience comprised business leaders, students/aspirants, enablers and institutions. The focus was on helping the audience understand how analytics is more than just another fad - it is a weapon to drive better management, cultural transformation and quantifiable business impact. In other words, it is about delivering effective leadership via an actionable vision, guided execution and transformation management.
Augment the actionability of Analytics with the "Voice of Customer" - Ramkumar Ravichandran
Currently, Voice of Customer, Analytics and Testing are treated as distinct functions and managed across siloed systems, resulting in under-realization of the true potential of these systems. Some of the biggest complaints cited by user groups of these functions can easily be solved by leveraging the power of one technique for another, be it the need for reasoning behind analytical findings, scale for research insights, or unintended consequences in testing. Integrating them closely - with the ability to talk to each other, data pass-throughs, and the ability for application servers to process and react to insights from across these systems - will help build a reasoned decision system. Together, these disparate but rich data sources can also open up avenues for exploratory research internally and externally, which can be monetized as actionable data products.
Predictive analytics has stopped being an advanced-analytics project done to gain competitive advantage. It is now the mainstay of every business and requires the ability to handle a wide variety of intricate problems, day in and day out, under ever-increasing RoI pressure, at a scale previously unimagined and at a speed previously inconceivable. As the current analytics maturity curve evolves to consider machine learning and artificial intelligence as integral components an organization should aspire to, predictive analytics must imbibe the best of product practices: agility of development, iterative learning and development, interoperability and a simpler interface, i.e., an API. Having an API-like framework helps predictive analytics integrate seamlessly with other analytical practices such as A/B testing and research, fit within the final product offering, and answer what could happen based not only on what happened in the past, but also on why it happened, the motivations and aspirations of customers, and customer engagement with competitive offerings. This leads to a virtuous cycle of enhanced predictive power, easier integration with a prescriptive framework, better actionability of insights and the ability to tweak actions via a test-and-learn framework.
Prepping the Analytics organization for the Artificial Intelligence evolution - Ramkumar Ravichandran
This is a discussion document to be used at Big Data Spain in Madrid on Nov 18th, 2016. The key takeaway from the deck is that AI is a reality, and much closer than we realize. It will impact the analytics community very differently from the average consumer. We can shape and guide the revolution if we start preparing for it now - in our mindset, our design-thinking principles and the productization of analytics (API-zation). AI is needed to address problems of scale, speed and precision in a world getting ever more complex around us - it is not humanly possible to answer all the questions ourselves, and we will need machines to do it for us. The storyline begins with a reality check on popular misconceptions and some background on AI. It then delves into the ways AI can optimize the current flow, and ends with the "Managing Innovation Playbook", a set of three steps that should guide our innovation programs - Strategy, Execution & Transformation - i.e., the principles that tell us what we want to get out of it, how to get it done, and finally how to make the benefits permanent and consistently improving.
Would love to hear your feedback, thoughts and reactions.
"Big Data, big data, big data" is all that anyone can think about today. It is the rage, it is the "in" thing, it is the "pill for all ills". People call it the new oil! It takes a moment to realize that it is gas that run automobile not raw oil. It requires taking a step back to realize that actionability can come from good reasoning, right analyses, incisive research and rigorous testing even if the data is small. Big data is useful in so many ways - statistically significant sample size, ability to manage "unknown-unknown" , micro targeting, etc. but it brings with it associated costs and noise too. This presentation is an attempt to bring back the conversation to quality of analytics, actionability of insights and confident decision making without dependency on complexity or volume of data. Analytics Value Chain is a framework where Strategic goals drive everything in data, analytics, research and testing with a quantifiable benefit on the bottom line. This was presented at Global Big Data Conference 2016 at Santa Clara.
Marketing is the face of the company, Marketing gives personality to the life that is the firm. Even though Marketing is a critical function, it has sometimes lagged in tapping true potential of analytics for good reasons. Marketing is a complex function with multiple moving parts and it is rather difficult to bring in too much control required for tracking, measuring and acting on the insights. However recent developments in big data, technology, awareness, analytical maturity and analytical techniques have made this easier. This deck is a discussion on practical challenges, potential opportunities and proposes an analytics value chain approach bringing together data, analytics, research and testing to inform and drive Strategy, manage execution and drive business impact with quantifiable business impact. This presentation was done at Digital Summit 2016 at Los Angeles.
Analytics is the hottest commodity on the job market today. Everyone wants to be an analyst and everyone wants analysis to inform their decisions. However barely scratching the surface reveals some disconnect between the Analyst community and their stakeholders ranging from expectations of actionability, to be able to understand the insights on the stakeholders side and the quality of problems being solved and the insights being acted up on from the analyst side. It leads to significant heartburn, churn and lost business opportunity. This presentation is a discussion on the drivers of the issues, possible solutions leveraging analytics and a framework for objective measurement of performance/contribution/action and growth & development. This was presented at TM Forum Live 2016 at Nice,France.
Analytics has proven itself to be a enabler of decisions, strategy and execution. But it is much more, it can help define and empower organizational culture. It can bring in transparency, accountability, collaboration, focus and objective pursuit of company vision and goals. This presentation was done at Customer Analytics & Insights Summit at Austin in Aug 2016.
Digital summit Dallas 2015 - Research brings back the 'human' aspect to insightsRamkumar Ravichandran
Every established firm needs engaged Consumers and brand loyalists and advocates - higher the share of loyal & engaged consumers, higher is the brand respect and business performance. Numbers are relatively inexpensive, quick, efficient and more direct way of understanding the engagement and drivers. However Research adds in the additional dimension of motivations/emotions driving such engagement. Only when we bring them together in a strategic way, can we truly appreciate our Customers & be able to offer them the best solutions & services.
Social media analytics - a delicious treat, but only when handled like a mast...Ramkumar Ravichandran
Social Media provides a wealth of insights into Brand's stand in the minds of consumers. It's usually unsolicited and represents true "connect" and if leveraged well as a channel can add a significant value addition to Consumer Engagement & Brand Management. However, easy it is not! It requires a well planned out strategy with right goals, the success criteria & a dedicated Social team. Reading it requires an "analyst" mindset, a strong technical setup and reacting to it requires strong business acumen. The slides tries to capture key considerations that should go into a Social Media Strategy.
Presented at the Product Management & Innovation Summit 2016 -a discussion on how insights derived from various analytical methods can help optimize decisions across the various stages in Product Life Cycle. Bringing them all together can help strategically prioritize development of features truly desired by Consumers, address issues quickly and capitalize on bigger opportunities.
Analytics has evolved from a support function into a Core Decision making tool. It provides unique capability of connecting the dots across organization & outside and leverage best practices/insights into making Decisions more actionable and outcomes predictable. With a top-down strategic view, iterative Test & Learn framework, hybrid team structure, context based User Experience Design, dual objective (Business & Learning) & recommendation/business case storytelling takes the Analytics deliverables into next level.
We propose a new needs driven framework for managing data with Data Lakes - Scalable Metrics Model. Salient features are modularity, extensiblity, flexibility and scalablity. We want to have self-contained modules which can either feed Reporting/Decision engines themselves with the capability of connecting across various other modules for Deep dive Analytics/Mining.
This will be presented at a Global Big Data Conference at Santa Clara on Sep 2nd. Come join us for a fun and learning event.
What makes insights from Analytics more/less actionable? -not always billion dollar revenue generation. Slides walk you through the various components that make it actionable - challenges & what can be done about them. It was presented at Text Analytics Summit NY 2015.
A/B Testing best practices from strategic vision to operational considerations to communication and finally expectations management. We need to adhere to fundamental project management, technology, statistical, experimental design, UX Design, Customer Relationship, business and data principles to ensure that the insights and hence the decision is as trustworthy as possible.
This talk was done at Business Analytics Innovation Summit 2015 @ Vegas on Jan 22nd. In this talk, we show problems with distributed Insight generation and the resultant problems. We recommend an Outcome Focused Framework for enabling Data Instrumentation, Data Management, Insight Generation and Open Analytics Platform.
Video used: https://www.youtube.com/channel/UCODSVC0WQws607clv0k8mQA/videos
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay AI-driven features streamline the data workflow. Finding the data you need shouldn't be a complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with a dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
As Europe's leading economic powerhouse and the fourth-largest hashtag#economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like hashtag#Russia and hashtag#China, hashtag#Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in hashtag#cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to hashtag#AdvancedPersistentThreats (hashtag#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
CONTENTS

What is Predictive Analytics?
Various ways of doing it
• Forecasting Techniques
• Decision Trees
• Regression
How to find out if a method works?
How to deploy them in the real world?
When to do Predictive Analytics vs. not?
REFERENCES

Intended for Knowledge Sharing only.
What is Predictive Analytics?
Prediction of the future value of a variable of interest (the "predicted" variable) from past values of either itself or other explanatory variables (the "predictors"), e.g., stock price movements, credit card default rates, inventory management, etc.
Concepts of Time Windows:
• Development window (Jan'08 – Jun'10): observe the predicted variable (stock price, default rate, etc.) and/or estimate its relationship with the predictor variables
• Validation window (Jul'10 – Dec'10): check whether prediction accuracy is within acceptable limits; if not, improve the prediction framework
• Prediction window (Jan'11 – May'11): use the predictive method to get the projections, and strategize business actions based on them

Other time components:
• Trend – long-term organic growth
• Seasonality – specific fluctuations repeating at certain time points (months, days) every year
[Chart: Rev ($Bn) over time, annotated with the Development, Validation and Prediction windows]
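The window concept above can be sketched in a few lines. This is a minimal illustration, not the deck's method: the revenue figures, the mean-forecast "model", and the error tolerance are all assumptions.

```python
# A minimal sketch of the time-window idea: "fit" on the development window,
# check accuracy on the validation window, then project into the prediction
# window. The revenue figures and tolerance below are illustrative assumptions.

monthly_revenue = [0.05, 0.06, 0.055, 0.07, 0.08, 0.075, 0.09, 0.10]  # Rev ($Bn)

# Development window: observe the predicted variable / fit the model
dev, val = monthly_revenue[:6], monthly_revenue[6:]
forecast = sum(dev) / len(dev)  # simplest possible model: a mean forecast

# Validation window: is prediction accuracy within acceptable limits?
errors = [abs(actual - forecast) for actual in val]
acceptable = (sum(errors) / len(errors)) / forecast < 0.5  # tolerance assumed

# Prediction window: use the method to get projections for the next 5 months
if acceptable:
    projections = [forecast] * 5
```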
Various ways of doing it
All methods can be grouped into three broad categories..
• Simple Forecasting Techniques
• Decision Trees
• Regression

Simple Forecasting Techniques:
• Moving Averages – average over the last 'x' months
• Decomposition Method – tease out the Trend and Seasonality components for use in predictions
• Holt Exponential Smoothing – apply Trend and Seasonality to exponential averages; exponential averages assign progressively smaller weights to older observations

Decision Trees:
• Break the population down into smaller buckets and predict for each bucket. Yield much higher prediction accuracy than simple forecasting techniques.

Regression:
• Establishes a mathematical relationship between the "predicted" and "predictor" variables, which can then be used to predict future values from known values of the predictors.
Simple Forecasting Techniques
The simplest methods of forecasting, but they cannot explain why they predict a certain value...
Moving Averages:
Prediction(t) = Average(Actual(t-1) … Actual(t-x))
For the next month, shift the averaging window by one month, and so on.

Decomposition Method:
Prediction(t) = Trended value (T) * Seasonality Index (SI)
-> T = actual value in the last available month * Growth factor,
where Growth factor = (Actual(t) – Actual(t-1)) / Actual(t-1)
-> SI = average of all Januaries / average of all months;
SI has to be calculated separately for each of the 12 months,
and the SI for the month being predicted is then applied

Holt Exponential Smoothing:
Prediction(t) = (Smoothed series + Trend (T)) * SI
-> Smoothed series(t) = Smoothing factor * Actual(t-1) +
(1 – Smoothing factor) * Smoothed series(t-1), and so on
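The three techniques above can be sketched directly from their formulas. This is a hedged illustration: the input series, window size, smoothing factor, and seasonality index are assumptions, and the decomposition sketch uses only the latest growth factor as described on the slide.

```python
# Minimal sketches of the three simple forecasting techniques described above.

def moving_average_forecast(series, x):
    """Predict the next value as the average of the last x observations."""
    return sum(series[-x:]) / x

def decomposition_forecast(series, seasonality_index):
    """Trend the last actual by the latest growth factor, then apply the
    seasonality index of the month being predicted."""
    growth = (series[-1] - series[-2]) / series[-2]
    trended = series[-1] * (1 + growth)
    return trended * seasonality_index

def exponential_smoothing(series, alpha):
    """Exponentially weighted average: newer observations get larger weights."""
    smoothed = series[0]
    for actual in series[1:]:
        smoothed = alpha * actual + (1 - alpha) * smoothed
    return smoothed

monthly_rev = [100.0, 102.0, 101.0, 105.0, 108.0, 110.0]
print(moving_average_forecast(monthly_rev, 3))       # average of last 3 months
print(round(exponential_smoothing(monthly_rev, 0.5), 2))
```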
[Chart: #International airline passengers ('000), range 350–700, comparing Actuals against Moving Averages (12 months), the Decomposition Method and Holt-Winters]
• Begins with the entire population and splits on the "predicted" variable (e.g., default rate) by a predictor variable, e.g., customer type – Sub-prime or Premium
• Checks whether the difference in the "predicted" variable (default rate) is statistically significant using a Chi-square or t-test
• If the difference is significant, it splits the resulting nodes* further by other variables
• If not, it goes back and tries to "significantly" split the population by another variable

How long does it keep splitting?
• As long as it keeps finding significant splits based on the Chi-square or t-tests
• Until it hits the maximum number of nodes* (a manageable number for business actions)
• When the count in the lowest nodes falls below 5% of the population

*Each subgroup resulting from a split is called a node
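The significance check behind each split can be sketched with a 2x2 Chi-square test on the candidate child nodes. The counts below are illustrative assumptions, not figures from the deck.

```python
# A sketch of the Chi-square significance check behind a decision-tree split:
# compare default counts across two candidate child nodes in a 2x2 table.

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 table:
              default  no-default
    node 1       a         b
    node 2       c         d
    """
    n = a + b + c + d
    numerator = (a * d - b * c) ** 2 * n
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Assumed counts: Sub-prime 50/1000 defaults vs. Premium 10/1000 defaults
stat = chi_square_2x2(50, 950, 10, 990)
CRITICAL_95 = 3.841  # Chi-square critical value, 1 degree of freedom, p = 0.05
print(stat > CRITICAL_95)  # significant difference -> keep this split
```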
Decision Trees
Higher prediction accuracy and explainability, since prediction is done at the member sub-group level…
All Credit Card holders – Default rate: 2%
  Sub-prime – Default rate: 5%
    FICO < 250 – Default rate: 8%
    FICO 250 to 400 – Default rate: 6%
    FICO > 400 – Default rate: 4%
  Premium – Default rate: 1%
    Monthly spend < $500 – Default rate: 0.5%
    Monthly spend > $500 – Default rate: 1.5%

(Each subgroup in the tree is a node.)
• Estimates the degree of relationship between the "predicted" variable and the "predictor" variables,
e.g., Credit card default = intercept + b1*bankruptcy + b2*payment-to-income ratio
-> intercept – the unexplained factor
-> b1, b2 – strength of the relationship: how much the "predicted" variable (default probability) changes with a unit change in the "predictor" values (bankruptcy or payment-to-income ratio)
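Scoring such an equation can be sketched as follows. Since the slide's example predicts a default probability, this sketch passes the linear score through the sigmoid, as logistic regression does; the intercept and coefficients are illustrative assumptions.

```python
# A sketch of scoring the default equation above. The coefficient values are
# assumptions; the sigmoid maps the linear score to a probability in (0, 1).
import math

INTERCEPT = -4.0   # assumed
B1 = 2.5           # assumed coefficient on the bankruptcy flag
B2 = 3.0           # assumed coefficient on the payment-to-income ratio

def default_probability(bankruptcy, payment_to_income):
    score = INTERCEPT + B1 * bankruptcy + B2 * payment_to_income
    return 1.0 / (1.0 + math.exp(-score))  # logistic (sigmoid) transform

p = default_probability(bankruptcy=1, payment_to_income=0.4)
print(round(p, 3))
```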
What are the various types of regression?
Regression
Highest prediction accuracy and explainability, since prediction is done at the individual member level…
Regression Methods: Linear, Logistic, ARIMA

Linear
• When to use: to predict the value of a variable, e.g., credit card spend, inventory quantity
• Inherent assumption: the predicted variable follows a "normal distribution", i.e., most members of the population have about-average values, with fewer counts towards the extremes

Logistic
• When to use: to predict the probability of a certain event happening, e.g., credit card default, inventory shortage
• Inherent assumption: the probability of the event follows a "binomial distribution", i.e., the probability of observing 'x' defaulters when picking 'N' members is highest when the proportion of defaulters in the population is x/N

ARIMA
• When to use: to predict future values from historical figures, e.g., future stock price from past figures
• Inherent assumption: a "stationary time series", i.e., the structure of the series does not change significantly (no increase in volatility or change in the growth rate itself)
How to find out if a method works?
Various measurement diagnostics can be used to check prediction accuracy…
• Root Mean Square Error (RMSE): typical difference between actual and predicted values.
RMSE = square root of the average of (actual – predicted)²
• Error rate (%): the error relative to the actual values of the predicted variable.
Error rate (%) = RMSE / average of actuals

Decision Trees and Regression models have more sophisticated diagnostics…
• R-square: tells how much of the variance in the "predicted" variable is captured by the model.
• Rank Order: checks whether the predicted values correlate with the actual values. Steps:
  • Sort the population by predicted values
  • Split into groups with an equal number of observations, generally ten groups or deciles
  • Get the average of both actual and predicted values for each group
  • Check whether both averages decrease steadily from the top group to the bottom
• Gains Chart: useful mostly for logistic regression models. Tells whether most of the defaulters are captured in the top groups; if not, the model isn't giving the highest probabilities to actual defaulters and needs to be revisited.
• Akaike Information Criterion (AIC): helps select the most "parsimonious" regression model – maximum information capture with the fewest predictors.
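The RMSE, error rate, and rank-order steps above can be sketched directly. The input lists are illustrative assumptions.

```python
# Sketches of the diagnostics described above: RMSE, error rate, and a
# decile-style rank-order check.
import math

def rmse(actuals, predicted):
    """Square root of the mean squared (actual - predicted) difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actuals, predicted)) / len(actuals))

def error_rate(actuals, predicted):
    """RMSE relative to the average of the actuals."""
    return rmse(actuals, predicted) / (sum(actuals) / len(actuals))

def rank_orders(actuals, predicted, groups=10):
    """Sort by predicted value (descending), split into equal groups, and
    return (average actual, average predicted) per group. The model rank-orders
    well if both averages decrease from the first group to the last."""
    paired = sorted(zip(predicted, actuals), reverse=True)
    size = len(paired) // groups
    out = []
    for g in range(groups):
        chunk = paired[g * size:(g + 1) * size]
        out.append((sum(a for _, a in chunk) / size,
                    sum(p for p, _ in chunk) / size))
    return out

actual = [10.0, 9.0, 7.0, 4.0]
pred = [9.5, 9.5, 6.0, 5.0]
print(round(rmse(actual, pred), 3))
print(round(error_rate(actual, pred), 3))
```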
How to deploy them in the real world?
Simple Forecasting Techniques are used to predict at the portfolio level only, e.g., predictions of an Auto-Lease portfolio's loss rates, but Decision Trees and Regression Models both require separate infrastructure to be deployed for real-time/non-real-time predictions…

A Decision Tree is used as a "rule engine". Every customer falls into one of the nodes, and the prediction for that node is used to act on that customer's request, e.g., a sub-prime customer with FICO < 250 will be targeted when he is just 1 payment due, whereas a premium customer will be given leverage of up to 4 payments due.

A Regression Model gives "account level" estimates which are then used to act on the customer's request, e.g., fraud models. The model has to run every time a customer transacts.
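The rule-engine deployment can be sketched as a routing function: the node thresholds and default rates mirror the example tree earlier in the deck, while the collections actions for the intermediate FICO bands are illustrative assumptions.

```python
# A sketch of the decision tree deployed as a rule engine: each customer is
# routed to a node and receives that node's prediction and action.

def score_customer(segment, fico=None, monthly_spend=None):
    """Return (predicted default rate, collections action) for a customer.
    Rates mirror the example tree; actions are illustrative assumptions."""
    if segment == "sub-prime":
        if fico < 250:
            return 0.08, "contact at 1 payment due"
        if fico <= 400:
            return 0.06, "contact at 2 payments due"
        return 0.04, "contact at 3 payments due"
    # premium segment
    if monthly_spend < 500:
        return 0.005, "leverage up to 4 payments due"
    return 0.015, "leverage up to 4 payments due"

rate, action = score_customer("sub-prime", fico=230)
print(rate, action)
```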
When to do Predictive Analytics vs. not?

Check three constraints before committing to a predictive model:
• Opportunity sizing – is the ROI acceptable?
• Minimum count requirements – is the minimum count of observations available?
• Required accuracy of prediction – is the prediction accuracy unsatisfactory?

[Diagram: constraints, explanations and questions laid out over MONTH 1–3, with MMF vs. NO MMF branches]
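The gating questions above can be sketched as a simple pre-flight check. The function name and threshold values are illustrative assumptions, not figures from the deck.

```python
# Illustrative pre-flight check for the three constraints on pursuing a
# predictive model; all input values below are assumptions.

def should_build_predictive_model(expected_roi, min_roi,
                                  available_count, required_count,
                                  achievable_accuracy, required_accuracy):
    """Return (decision, reasons): proceed only if every constraint passes."""
    reasons = []
    if expected_roi < min_roi:
        reasons.append("ROI not acceptable")
    if available_count < required_count:
        reasons.append("minimum count not available")
    if achievable_accuracy < required_accuracy:
        reasons.append("prediction accuracy unsatisfactory")
    return (len(reasons) == 0, reasons)

ok, why = should_build_predictive_model(1.8, 1.2, 50_000, 10_000, 0.92, 0.90)
print(ok, why)
```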
REFERENCES
Simple Forecasting Techniques
http://itl.nist.gov/div898/handbook/pmc/section4/pmc4.htm
Binomial Distributions
http://www.itl.nist.gov/div898/handbook/eda/section3/eda366i.htm
Exponential Smoothing
http://forecasters.org/pdfs/foresight/free/Issue19_goodwin.pdf
Decision Trees
http://www.salford-systems.com/resources/whitepapers/index.html
Linear Regression
http://faculty.chass.ncsu.edu/garson/PA765/regress.htm
Logistic Regression
http://faculty.chass.ncsu.edu/garson/PA765/logistic.htm
ARIMA Regression (also called the Box-Jenkins methodology)
http://www.itl.nist.gov/div898/handbook/pmc/section4/pmc445.htm