2. Banco Savings Bank
Zihang – Chief Executive Officer
Megan – Lead Data Analyst
Chuo – Chief Marketing Officer
Jiaqi – Data Scientist
Peter – Chief Creative Consultant
3. Business Problem
What do existing subscribers have in common?
Who is likely to subscribe to a term deposit account?
What can telemarketers do to increase effectiveness?
4. Process Flow
CLUSTER ANALYSIS: Profile Clients
DECISION TREE: Group Clients (IF/THEN Rules)
LOGISTIC REGRESSION: Individual Performance
Becoming More Specific
5. Understand the 41K records we have
41,188 records with 16 variables (5 numerical, 11 categorical)
Customer Information
- Age
- Job
- Marital status
- Education
- Default
- Housing loan
- Personal loan
Current Campaign Information
- Communication type
- Month of last contact
- Day of week of last contact
- Duration
- Campaign: number of contacts
Past Campaign Information
- Pdays: days passed since the last contact in a previous campaign
- Previous: number of contacts
- Poutcome: outcome of the previous campaign
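As a quick check of the variable counts above, the sketch below sorts the listed variables into numerical and categorical groups. The column names are assumed shorthand for the slide's labels, and the eleventh categorical variable is assumed to be the yes/no subscription outcome, which the list does not show.

```python
# Column names are assumed shorthand for the variables listed on the slide.
numerical = ["age", "duration", "campaign", "pdays", "previous"]
categorical = [
    "job", "marital", "education", "default", "housing", "loan",  # customer
    "contact", "month", "day_of_week",                            # current campaign
    "poutcome",                                                   # past campaign
    "subscribed",  # assumed outcome column (yes/no), not listed on the slide
]

total_variables = len(numerical) + len(categorical)
print(len(numerical), len(categorical), total_variables)
```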
6. 24% Success rate in previous campaigns
Only 11% of customers have subscribed to deposit accounts
7. The outcome is promising
63% have middle-to-high income
They have money for a deposit account
82% don’t have personal loans
Only 3 have credit in default
8. 2 Findings About Campaign
● Catch interest within 4 minutes and 18 seconds
● Reconsider the strategy: one customer has been contacted 56 times
9. Find Natural Patterns
Cluster Analysis
● Used to identify natural groups among a set of clients, based on a set of numerical variables.
Numerical Variables: Age, Duration, Campaign, Previous
Data Manipulation -> Decide # of Clusters -> Cluster Detail
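The clustering step can be sketched roughly as follows. The slides do not name the algorithm or tool, so k-means on standardized variables is an assumption here, as are the toy client rows and the farthest-first initialization (used only to keep the sketch deterministic).

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit variance so no variable dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iters=20):
    """Lloyd's algorithm; returns one cluster label per row."""
    # Farthest-first initialization: start at the first row, then repeatedly
    # add the row that is farthest from the centroids chosen so far.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(farthest))
    labels = []
    for _ in range(iters):
        # Assignment step: each row joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(r, centroids[j])) for r in rows]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(c) for c in zip(*members)]
    return labels

# Toy rows (made up): age, duration in seconds, campaign contacts, previous contacts.
clients = [
    [30, 90, 5, 0], [35, 120, 6, 0], [32, 100, 7, 0],
    [55, 600, 1, 2], [60, 650, 1, 3], [58, 700, 2, 2],
]
labels = kmeans(standardize(clients), k=2)
print(labels)
```

Standardizing first matters because duration is measured in seconds while campaign and previous are small counts; without it, duration alone would drive the distances.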
11. Classify subscribers and non-subscribers
Decision Tree
● Use a set of “IF/THEN” rules to predict who will or will not subscribe to the long-term deposit.
Why Decision Tree?
● Categorical Variables
● Easy to interpret
Data Manipulation -> Classify Customers -> Group Detail
12. Important Variables Of Decision Tree
Accuracy Rate: 81%
Important Variables
Duration: between 205 and 493 seconds, or longer than 493 seconds
Month: October, December, March, April, September
Contact: Cellular
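A duration cut-off like the ones above is what a decision tree learns when it searches for the split that best separates subscribers from non-subscribers. The slides do not show how the tree was grown; the sketch below assumes the common Gini-impurity criterion and uses made-up call durations, so the resulting threshold is illustrative only.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(values, labels):
    """Return the cut-off on one numeric feature with the lowest weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    best_score, best_thr = float("inf"), None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no cut-off fits between identical values
        thr = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for x, y in pairs if x <= thr]
        right = [y for x, y in pairs if x > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_score, best_thr = score, thr
    return best_thr

# Made-up call durations (seconds) and outcomes (1 = subscribed).
durations = [60, 120, 180, 200, 300, 450, 520, 700]
subscribed = [0, 0, 0, 0, 1, 1, 1, 1]

threshold = best_threshold(durations, subscribed)

def predict(duration):
    """The learned IF/THEN rule: IF duration > threshold THEN 'yes' ELSE 'no'."""
    return "yes" if duration > threshold else "no"

print(threshold, predict(400), predict(100))
```

A full tree repeats this search inside each resulting group and across all variables, which is how several duration bands plus month and contact rules can appear together.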
13. Predict whether a potential client will subscribe
Logistic Regression
● Binary Classifier System
Why Logistic Regression
● Response variable: “Yes” or “No”
● Outcome is a probability between 0 and 1
Cleanup -> Explore -> Develop
14. High Subscription Attributes
● Job: housemaids, retired, students or unemployed
● Contact: Cellular
● Month: December, March, October, and September
● Duration: Longer Duration, Higher Probability
● Campaign: More contacts, lower Probability
Accuracy Rate: 90%
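The direction of the duration and campaign effects can be illustrated with the logistic function, which turns a linear score into a probability between 0 and 1. The coefficients below are hypothetical, chosen only to match the signs reported above (duration positive, campaign negative); they are not the fitted model's values.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real-valued score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, NOT taken from the fitted model.
INTERCEPT = -3.0
COEF_DURATION_MIN = 0.4   # positive: longer calls raise the probability
COEF_CAMPAIGN = -0.2      # negative: more contacts lower the probability

def subscribe_probability(duration_min, contacts):
    """Probability of subscribing under the hypothetical coefficients above."""
    score = INTERCEPT + COEF_DURATION_MIN * duration_min + COEF_CAMPAIGN * contacts
    return sigmoid(score)

short_call = subscribe_probability(duration_min=2, contacts=1)
long_call = subscribe_probability(duration_min=10, contacts=1)
many_contacts = subscribe_probability(duration_min=10, contacts=8)
print(round(short_call, 2), round(long_call, 2), round(many_contacts, 2))
```

With these assumed values, a longer call raises the predicted probability and repeated contacts pull it back down, which is the pattern the findings describe.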
15. Summary of Results
Recall the Results
Model: Important Attributes
Cluster Analysis
● Duration
● Campaign
Decision Tree
● Duration: >205s
● Month: March, April, September, October, December
● Contact: Cellular
Logistic Regression
● Job: Housemaids, Retired, Students or Unemployed
● Contact: Cellular
● Month: March, September, October, and December
● Duration (+)
● Campaign (-)
(Becoming more specific from top to bottom)
16. Recommendation
Findings -> Recommendations
Cellular -> Campaign: use cellular to contact potential customers.
December/March/October/September -> Launch campaigns in these months.
Students/Retired/Housemaids/Unemployed -> Create a unique message for each type of customer.
Longer Duration/Fewer Contacts -> Be prepared for every call.
The Portuguese banking institution is promoting its term deposit accounts with a direct marketing campaign (phone calls). It has records of its current customers across all services. The Portuguese banking institution would like to know
Profile -> Group -> Individual
Interpretation:
● 52% Blurring
● Slight difference
● Combine Clusters 1 and 2, or provide more variables