Goal for Client - Improve sales with data analysis.
NUVEEN:
- primarily markets and sells mutual funds through investment professionals like brokers, financial planners, and financial advisors.
- maintains rich dataset on sales transactions
- needs to acquire new clients cost-effectively, sell more to existing advisers, reduce redemptions (ADR – acquire, develop, retain)
Our Solution: use Supervised Machine Learning Modeling to assist Sales & Marketing in improving their targeting of financial advisers:
- Predict sales for 2019 using the data for 2018
- Estimate the probability of an adviser adding a new fund in 2019
Data Driven Funds Sales Improvement - ML in Finances
1. Data Driven Funds Sales Improvement
CAPSTONE PROJECT BY
Dr. Larisa Golovko - CEO of Landviser, LLC
Applicant for Post Grad Diploma in ML/AI
(EMERITUS & Columbia University School of Engineering)
2. Client: NUVEEN
Nuveen was acquired by TIAA in 2014.
TIAA is committed to helping institutions and individuals pursue positive outcomes through an array of global, diversified financial services and a long-term investment perspective.
https://landviser.maps.arcgis.com/
3. Goal: Improve Sales with Data Analysis
NUVEEN:
❑ primarily markets and sells mutual funds through investment professionals like brokers, financial planners, and financial advisors.
❑ maintains rich dataset on sales transactions
❑ needs to acquire new clients cost-effectively, sell more to existing advisers, reduce redemptions (ADR – acquire, develop, retain)
Our Goal: use Supervised Machine Learning Modeling to help Sales & Marketing improve their targeting of financial advisers:
✓ Predict sales for 2019 using the data for 2018
✓ Estimate the probability of an adviser adding a new fund in 2019
4. “Essentially, all models are wrong, but some are useful…”
- GEORGE BOX, BRITISH STATISTICIAN (1919-2013)
5. Model Development and Analytics Is an Iterative Process
1. Understand Data
2. Clean Data
3. Test Models
4. Use in Business -> Get More Data
5. Back to #1
https://mspoweruser.com/microsoft-announces-team-data-science-process-agile-methodology-improve-collaboration/
11. QA/QC
• All missing data are assumed to be “0”: a fair assumption, since these are financial transactions
• We were advised to change all negative values (especially in AUM) to “0”; tried this as the #1 approach
• However, up to 25% of the values in more than 50% of the variables are negative, so forcing them to “0” creates a highly unbalanced dataset
• Adopted the #2 approach: carefully replacing negatives
12. QC: Redemption – Should Be Negative?
ORIGINAL DATA: FEW POSITIVE RECORDS TRANSFORMED TO NEGATIVES
13. QC: Sales and AUM – Should Be Positive?
ORIGINAL DATA ALL CONVERTED TO ABSOLUTE VALUES
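The #2 approach described on the QA/QC slides could look roughly like the sketch below. It is a minimal illustration, not the project's actual cleaning code: the column names (`sales_2018`, `aum`, `redemption_2018`), the toy values, and the helper `clean_transactions` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

def clean_transactions(df, positive_cols, negative_cols):
    """Fill missing values with 0 and enforce the expected sign per column.

    Hypothetical helper: the column groupings are assumptions, not the
    project's real schema.
    """
    out = df.copy()
    cols = list(positive_cols) + list(negative_cols)
    # Missing financial transactions are treated as 0.
    out[cols] = out[cols].fillna(0)
    # Sales and AUM are expected to be non-negative: take absolute values.
    out[positive_cols] = out[positive_cols].abs()
    # Redemptions are expected to be non-positive: flip stray positives.
    out[negative_cols] = -out[negative_cols].abs()
    return out

# Toy data with made-up values, for illustration only.
raw = pd.DataFrame({
    "sales_2018": [100.0, -50.0, np.nan],
    "aum": [-2000.0, 3000.0, 0.0],
    "redemption_2018": [-10.0, 5.0, np.nan],
})
clean = clean_transactions(raw, ["sales_2018", "aum"], ["redemption_2018"])
print(clean)
```

No rows are dropped, which is consistent with all records being preserved after cleaning.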
14. Clean Data for Further Analysis
➢ All 10,005 records preserved
➢ Interactive Dashboard for data exploration can be deployed on a website for review by the client
16. Data Pipeline
1. Separate Predictor (X) and Target (y) Variables
2. Prepare Continuous Numeric y-Sales_2019 for Regression Model
3. Prepare Categorical (Binary) y-NewFundAdded_2019
4. Split Data into Train and Test Sets 50-50
✓ Fit Various Models on Cleaned and Transformed Data
✓ For Reproducible Results, Fix the Random State Parameter
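The pipeline steps above can be sketched with scikit-learn's `train_test_split`. This is only an illustration under stated assumptions: the synthetic data frame, its column names, and the seed value 42 are invented for the example and do not come from the NUVEEN data set.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned data; column names are assumptions.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "sales_2018": rng.gamma(2.0, 1000.0, n),
    "aum": rng.gamma(2.0, 5000.0, n),
    "redemption_2018": -rng.gamma(2.0, 300.0, n),
})
data["sales_2019"] = 0.8 * data["sales_2018"] + rng.normal(0.0, 500.0, n)
data["new_fund_added_2019"] = (rng.random(n) < 0.3).astype(int)

# Step 1: separate predictors (X) from the two targets (y).
targets = ["sales_2019", "new_fund_added_2019"]
X = data.drop(columns=targets)
y_sales = data["sales_2019"]              # step 2: continuous target
y_new_fund = data["new_fund_added_2019"]  # step 3: binary target

# Step 4: one 50-50 split shared by both targets; a fixed random_state
# makes the split repeatable between runs.
X_train, X_test, ys_train, ys_test, yf_train, yf_test = train_test_split(
    X, y_sales, y_new_fund, test_size=0.5, random_state=42
)
print(X_train.shape, X_test.shape)
```

Splitting both targets in a single call keeps the regression and classification models on the same train and test advisers.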
17. Few Models Fitted on “0”-Filled Data Set (#1 Approach)
RIDGE REGRESSION WITH LOG TRANSFORMATION
Test Score = 0.10
LASSO REGRESSION (PROMISING)
Test Score = 0.19
18. LASSO Regression Model Chosen to Predict Sales Distribution
RIDGE REGRESSION WITH LOG TRANSFORMATION
Train Set: 0.17 Test Set: 0.14
LASSO REGRESSION
Train Set: 0.52 Test Set: 0.37
#2 Approach (Careful Handling of Negatives) improved prediction by about 100%: the LASSO test score rose from 0.19 to 0.37
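A comparison like the one above could be set up as in the following sketch. Several things here are assumptions rather than details from the slides: the reported scores are taken to be R² (the default `score` of scikit-learn regressors), the log transformation is applied to the target with `log1p`, and the synthetic data, `alpha` values, and scaling step are placeholders.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, right-skewed positive target standing in for 2019 sales.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = np.exp(0.5 * X[:, 0] + 0.1 * rng.normal(size=400))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=42
)

# Ridge fitted on log1p(y); predictions are mapped back with expm1.
ridge_log = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    func=np.log1p,
    inverse_func=np.expm1,
)
# Plain LASSO on the untransformed target.
lasso = make_pipeline(StandardScaler(), Lasso(alpha=0.01))

# Collect (train R^2, test R^2) for each model.
scores = {}
for name, model in {"ridge_log": ridge_log, "lasso": lasso}.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))

for name, (train_r2, test_r2) in scores.items():
    print(f"{name}: train={train_r2:.2f} test={test_r2:.2f}")
```

Reporting train and test scores side by side, as the slide does, makes the gap between them (a rough sign of overfitting) easy to see.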
20. Probability of Adding New Fund in 2019
Classification Problem
Logistic Regression and Gradient Boosting Classifier yielded similar results – 76% Accuracy
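The two classifiers named above can be compared along the lines of this sketch. The synthetic features, the coefficients used to generate the labels, and all hyperparameters are assumptions for illustration; the accuracies it prints say nothing about the 76% reported for the real data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary target standing in for "new fund added in 2019".
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
logit = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(600) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=42
)

classifiers = {
    "logistic": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "gboost": GradientBoostingClassifier(random_state=42),
}

accuracy = {}
proba = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    accuracy[name] = accuracy_score(y_test, clf.predict(X_test))
    # Column 1 of predict_proba is the estimated probability of class 1,
    # i.e. of the adviser adding a new fund.
    proba[name] = clf.predict_proba(X_test)[:, 1]

print(accuracy)
```

The per-adviser probabilities in `proba` are what a sales team would sort on when choosing whom to contact first.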
21. By Targeting the Top 20% of Advisers, New Fund Sales Can Be Improved Two-Fold
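One common way to quantify a claim like this is lift: the rate of new-fund adoption among the top-scored advisers divided by the rate across all advisers. The sketch below assumes that reading of the slide; the function `lift_at_top` and the ten toy records are hypothetical.

```python
import numpy as np

def lift_at_top(y_true, scores, fraction=0.2):
    """Positive rate among the top `fraction` by score, over the base rate."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    n_top = max(1, int(round(fraction * len(y_true))))
    # Indices of the highest-scored records, best first.
    top_idx = np.argsort(scores)[::-1][:n_top]
    return y_true[top_idx].mean() / y_true.mean()

# Toy example: 10 advisers, 3 of whom added a new fund (made-up values).
y_true = np.array([1, 1, 0, 1, 0, 0, 0, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05])
print(lift_at_top(y_true, scores))
```

Under this definition, a lift of 2 at the top 20% means the targeted advisers add new funds at twice the overall rate.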
22. Further Business Improvements with Data Science
Deploy:
• Model can be pickled and quickly deployed on a secure company server for use on new data (e.g. 2020 transactions to predict 2021 sales)
• Model can be re-trained with additional or re-engineered data (e.g. OneHotEncoder to include advisor descriptive variables in Regression Models)
• LASSO Regression with a normalized target Sales variable (e.g. via Power Transformer) is likely to improve fit
New Data:
• Improve logistics of outreach to advisors and prediction of their sales performance with Geo-Positioning*, if data on their location are available
• Additional historical sales (Time Series*): consider including monthly transactions for the previous 1-3 years instead of a whole-year summary
Product Dev:
• Evaluate municipal funds' potential and risks (spatial* and historical* analysis)
• Use Markowitz Efficient Frontier Theory for Funds and Securities Portfolio Development
* indicates our core expertise
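Two of the deployment ideas on this slide, normalizing the Sales target with a power transform and pickling the fitted model, can be combined in a short sketch. Everything specific here is an assumption: the synthetic data, the `alpha` value, the Yeo-Johnson method, and the in-memory `pickle.dumps` (a real deployment would more likely write to a file with `pickle.dump`).

```python
import pickle

import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import PowerTransformer

# Synthetic stand-ins for 2018 predictors and 2019 sales.
rng = np.random.default_rng(0)
X_2018 = rng.normal(size=(300, 4))
y_2019 = np.exp(0.5 * X_2018[:, 0] + 0.1 * rng.normal(size=300))

# LASSO fitted on a power-transformed target; predictions are mapped
# back to the original sales scale automatically.
model = TransformedTargetRegressor(
    regressor=Lasso(alpha=0.01),
    transformer=PowerTransformer(method="yeo-johnson"),
)
model.fit(X_2018, y_2019)

# Serialize the fitted model and restore it, as a server would on load.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

# Score new, unseen records (e.g. the following year's transactions).
X_new = rng.normal(size=(5, 4))
pred = restored.predict(X_new)
print(pred)
```

A pickled model should only be loaded from a trusted source and with a scikit-learn version compatible with the one used for training.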
23. Thank You!
FOR THE EXCITING OPPORTUNITY TO WORK WITH REAL INVESTMENT DATA OF NUVEEN AND UTILIZE MULTIPLE MACHINE LEARNING TECHNIQUES LEARNED IN THE EMERITUS ML/AI COURSE
24. Connect with me @ LinkedIn.com/in/larisagolovko/ or email larisa@landviser.com
https://landviser.com/contact/