AI in finance

AI and Machine Learning in Finance
2019 Copyright QuantUniversity LLC.
Presented By:
Sri Krishnamurthy, CFA, CAP
sri@quantuniversity.com
www.analyticscertificate.com

3
What is Machine Learning and AI?
• “AI is the theory and development of computer systems able to perform tasks that traditionally have required human intelligence.”
• “AI is a broad field, of which ‘machine learning’ is a sub-category.”
Source: http://www.fsb.org/wp-content/uploads/P011117.pdf

4
Source: McKinsey Report: An executive guide to AI

6
Machine Learning & AI in finance – A paradigm shift
Quant:
• Stochastic Models
• Factor Models
• Optimization
• Risk Factors
• P/Q Quants
• Derivative pricing
• Trading Strategies
• Simulations
• Distribution fitting
Data Scientist:
• Real-time analytics
• Predictive analytics
• Machine Learning
• RPA
• NLP
• Deep Learning
• Computer Vision
• Graph Analytics
• Chatbots
• Sentiment Analysis
• Alternative Data

7
A framework for evaluating your organization’s appetite for AI
and machine learning
Source: http://www.fsb.org/wp-content/uploads/P011117.pdf

8
Sizing the opportunity
Source: McKinsey Report: Notes from the AI frontier: Insights from hundreds of use cases

10
Handling Data
Data
• Cross-sectional
▫ Numerical
▫ Categorical
• Longitudinal
▫ Numerical

11
Goal
• Descriptive Statistics
▫ Cross-sectional
– Numerical
– Categorical
– Numerical vs Categorical
– Categorical vs Categorical
– Numerical vs Numerical
▫ Time series
• Predictive Analytics
▫ Cross-sectional
– Segmentation
– Prediction: predict a number or predict a category
▫ Time-series

12
Machine Learning Algorithms
• Supervised
▫ Prediction
– Parametric: Linear Regression, Neural Networks
– Non-parametric: KNN, Decision Trees
▫ Classification
– Parametric: Logistic Regression, Neural Networks
– Non-parametric: Decision Trees, KNN
• Unsupervised algorithms
▫ K-means
▫ Association rule mining

13
The Process
1. Data cleansing
2. Feature Engineering
3. Training and Testing
4. Model building
5. Model selection

14
Evaluation framework
Evaluating Machine learning algorithms
• Supervised - Prediction: R-square, RMS, MAE, MAPE
• Supervised - Classification: Confusion Matrix, ROC Curves
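
The metrics named on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenter's code: the function names and sample values are assumptions, and "RMS" is read here as root-mean-square error.

```python
import math

def regression_metrics(actual, predicted):
    """Return R-square, RMS error, MAE and MAPE for two equal-length lists."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    return {
        "r_square": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE is undefined when an actual value is zero.
        "mape": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }

def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

# Illustrative values only.
reg = regression_metrics([100.0, 200.0, 300.0, 400.0], [110.0, 190.0, 310.0, 390.0])
cm = confusion_matrix([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```

An ROC curve would be built by recomputing the confusion matrix at a range of score thresholds and plotting the true-positive rate against the false-positive rate.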

17
1. Machine learning is not a generic solution to all problems
Claim:
• Machine learning is better for fraud detection, looking for arbitrage opportunities and trade execution
Caution:
• Beware of imbalanced class problems
• A model that gives 99% accuracy may still not be good enough
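
A small sketch of why 99% accuracy can mislead on an imbalanced problem. The data is hypothetical: 1,000 transactions with 10 frauds, scored by a "model" that never flags anything.

```python
# Hypothetical labels: 1 = fraud, 0 = legitimate.
actual = [1] * 10 + [0] * 990
# A trivial "model" that predicts "legitimate" for every transaction.
predicted = [0] * 1000

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
true_positives = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))
recall = true_positives / sum(actual)  # share of frauds actually caught

print(f"accuracy={accuracy:.2%}, recall={recall:.2%}")
```

The accuracy is 99% while the recall on the fraud class is zero, which is why accuracy alone says little about a fraud detector.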

18
2. A prototype model is not your production model
Claim:
• Our models work on datasets we have tested on
Caution:
• Do we have enough data?
• How do we handle bias in datasets?
• Beware of overfitting
• Historical Analysis is not Prediction
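
The overfitting caution can be illustrated with a model that simply memorises its training data. The sketch below uses a 1-nearest-neighbour predictor on hand-made, hypothetical data (y = 2x with alternating noise of +3 and -3); for simplicity the unseen points are noise-free.

```python
def one_nn_predict(train, x):
    # Return the y of the training point whose x is closest: pure memorisation.
    return min(train, key=lambda point: abs(point[0] - x))[1]

def mean_abs_error(train, data):
    return sum(abs(one_nn_predict(train, x) - y) for x, y in data) / len(data)

# Hypothetical training data: y = 2x plus alternating noise.
train = [(float(x), 2.0 * x + (3.0 if x % 2 == 0 else -3.0)) for x in range(10)]
# Unseen points that follow the underlying relationship y = 2x.
unseen = [(x + 0.4, 2.0 * (x + 0.4)) for x in range(10)]

in_sample = mean_abs_error(train, train)       # error on data the model has seen
out_of_sample = mean_abs_error(train, unseen)  # error on data it has not seen
```

The in-sample error is zero because every training point is its own nearest neighbour, while the out-of-sample error is not: a perfect fit to history says little about predictive performance.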

19
AI and Machine Learning in Production
Kristy Roth from HSBC:
“It’s been somewhat easy - in a funny way - to get going using sample data, [but] then you hit the real problems,” Roth said.
“I think our early track record on PoCs or pilots hides a little bit the underlying issues.”
Matt Davey from Societe Generale:
“We’ve done quite a bit of work with RPA recently and I have to say we’ve been a bit disillusioned with that experience.”
“The PoC is the easy bit: it’s how you get that into production and shift the balance.”
Source: https://www.itnews.com.au/news/hsbc-societe-generale-run-into-ais-production-problems-477966

20
3. We are just getting started!
Claim:
• It works. We don’t know how!
Caution:
• It’s still not a proven science
• Interpretability or “auditability” of models is important
• Transparency in codebase is paramount with the proliferation of open-source tools
• Skilled data scientists who are knowledgeable about algorithms and their appropriate usage are key to successful adoption

21
4. Choose the right metrics for evaluation
Claim:
• Machine Learning models are more accurate than traditional models
Caution:
• Is accuracy the right metric?
• How do we evaluate the model? RMS or R²?
• How does the model behave in different regimes?

22
5. Are we there yet?
Claim:
• Machine Learning and AI will replace humans in most applications
Caution:
• Beware of the hype!
• Just because it worked sometimes doesn’t mean that the organization can be on autopilot
• Will we have true AI or Augmented Intelligence?
• Model risk and robust risk management are paramount to the success of the organization.
• We are just getting started!
Source: https://www.bloomberg.com/news/articles/2017-10-20/automation-starts-to-sweep-wall-street-with-tons-of-glitches

Credit risk in consumer credit
Credit-scoring models and techniques assess the risk in lending to
customers.
Typical decisions:
• Grant or deny credit to new applicants
• Increase or decrease spending limits
• Increase or decrease lending rates
• What new products can be offered to existing customers?

Credit assessment in consumer credit
History:
• Gut feel
• Social network
• Communities and influence
Traditional:
• Scoring mechanisms through credit bureaus
• Bank assessments through business rules
Newer approaches:
• Peer-to-Peer lending
• Prosper Marketplace

26
The Data
https://www.kaggle.com/wendykan/lending-club-loan-data

Dataset, Variable and Observation
Dataset: A rectangular array with rows as observations and columns as variables
Variable: A characteristic of members of a population (Age, State, etc.)
Observation: List of variable values for a member of the population
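
As a minimal sketch of these three terms in code, a dataset can be held as a list of rows. The records and column names below are hypothetical and are not taken from the Lending Club data.

```python
# Hypothetical loan records: each dict is one observation (row),
# each key is one variable (column).
dataset = [
    {"age": 34, "state": "MA", "loan_amount": 12000},
    {"age": 51, "state": "NY", "loan_amount": 8000},
    {"age": 27, "state": "CA", "loan_amount": 15000},
]

variables = list(dataset[0].keys())      # the columns
observation = dataset[1]                 # one member of the population
ages = [row["age"] for row in dataset]   # all values of one variable
```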

29
Machine Learning
• Supervised Algorithms
▫ Given a set of variables $x_i$, predict the value of another variable $y$ in a given data set
▫ If $y$ is numeric => Prediction
▫ If $y$ is categorical => Classification
x1, x2, x3, … → Model F(X) → y

30
Supervised Learning models - Prediction
• Parametric models
▫ Assume some functional form
▫ Fit coefficients
• Examples: Linear Regression, Neural Networks
$y = \beta_0 + \beta_1 x_1$
[Figures: Linear Regression Model; Neural network Model]
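
A minimal sketch of "assume a functional form, then fit coefficients" for a linear regression with one predictor, using the closed-form least-squares estimates. The function name and data are illustrative assumptions.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a straight line through (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Illustrative data generated from y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
beta0, beta1 = fit_simple_ols(x, y)
```

The functional form (a straight line) is fixed in advance; only the two coefficients are estimated from the data.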

31
Supervised Learning models
• Non-Parametric models
▫ No functional form assumed
• Examples: K-nearest neighbors, Decision Trees
[Figures: K-nearest neighbor Model; Decision tree Model]
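
By contrast, a K-nearest-neighbors classifier assumes no functional form and predicts directly from the stored training points. The sketch below is a minimal version; the borrower features and labels are hypothetical.

```python
from collections import Counter

def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # train is a list of (features, label) pairs; distance is squared Euclidean.
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical borrowers: (debt-to-income ratio, credit utilisation) -> outcome.
train = [
    ((0.10, 0.20), "repaid"), ((0.15, 0.30), "repaid"), ((0.20, 0.25), "repaid"),
    ((0.60, 0.80), "default"), ((0.70, 0.90), "default"), ((0.65, 0.85), "default"),
]
low_risk = knn_classify(train, (0.18, 0.28))
high_risk = knn_classify(train, (0.62, 0.82))
```

There are no coefficients to fit here; the only choice is k, which is set before training.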

32
The Workflow
1. Data cleansing
2. Feature Engineering
3. Training and Testing
4. Model building
5. Model selection
6. Model Deployment
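
The "Training and Testing" step usually starts by holding out part of the data. The following is a minimal sketch of such a split; the function name, the 25% test fraction and the seed are assumptions.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of `rows` and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(20))  # stand-in for 20 observations
train_rows, test_rows = train_test_split(rows)
```

For time-series data a random shuffle is usually inappropriate; a split by date keeps future observations out of the training set.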

33
AutoML
• Automated machine learning (AutoML) is the process of automating the end-to-end process of applying machine learning to real-world problems.

34
Types of frameworks
• Automated Feature Engineering
▫ Feature selection
▫ Feature extraction
▫ Meta learning and transfer learning
▫ Detection and handling of skewed data and/or missing values
• Hyper-parameter optimization
• Model Selection
• Reference: https://en.wikipedia.org/wiki/Automated_machine_learning

35
Parameters vs Hyperparameters
• Parameters: Values that can be estimated from data
▫ Examples:
– Regression Coefficients
– Weights in a Neural Network
• Hyperparameters: Values that are external to the model and cannot be learnt from the data
▫ Examples:
– Learning rate in a Neural Network
– Regularization parameters
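
The distinction can be seen in a small gradient-descent sketch: the slope `w` is a parameter estimated from the data, while `learning_rate` is a hyperparameter chosen beforehand. The function and the data are illustrative assumptions.

```python
def fit_slope(x, y, learning_rate, epochs=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0  # parameter: learnt from the data
    n = len(x)
    for _ in range(epochs):
        gradient = sum(2 * (w * xi - yi) * xi for xi, yi in zip(x, y)) / n
        w -= learning_rate * gradient  # hyperparameter: set by the user
    return w

# Illustrative data generated from y = 2x.
x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
w_good = fit_slope(x, y, learning_rate=0.05)  # converges towards 2
w_bad = fit_slope(x, y, learning_rate=0.5)    # step too large: diverges
```

The same data and the same model give a usable estimate or a useless one depending only on the hyperparameter, which is why hyperparameters need their own search.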

36
Hyperparameter optimization
• Hyperparameter optimization finds a tuple of hyperparameters that yields an optimal model which minimizes a predefined loss function on given independent data.[1]
• [1] Claesen, Marc; Bart De Moor (2015). "Hyperparameter Search in Machine Learning".
• Image from: https://support.sas.com/resources/papers/proceedings17/SAS0514-2017.pdf
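
One simple way to carry out this search is a grid search: fit the model for each candidate value and keep the one with the lowest loss on held-out data. The sketch below tunes a single ridge penalty for a one-variable regression without intercept; the data, the grid and the function names are assumptions, and real searches typically cover several hyperparameters at once.

```python
def fit_ridge_slope(x, y, lam):
    # Closed-form ridge estimate for y = w * x (no intercept).
    return sum(xi * yi for xi, yi in zip(x, y)) / (sum(xi * xi for xi in x) + lam)

def mse(w, x, y):
    return sum((w * xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)

# Hypothetical training data and independent validation data.
x_train, y_train = [1.0, 2.0, 3.0], [2.2, 3.9, 6.3]
x_valid, y_valid = [4.0, 5.0], [8.0, 10.0]

grid = [0.0, 0.1, 0.5, 1.0, 10.0]  # candidate regularization strengths
scores = {
    lam: mse(fit_ridge_slope(x_train, y_train, lam), x_valid, y_valid)
    for lam in grid
}
best_lambda = min(scores, key=scores.get)  # lowest validation loss wins
```

The loss is measured on the validation set, not the training set, which matches the "independent data" in the definition above.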

37
Model selection considerations
• Interpretability: Ability of users to understand the model, the parameters of the model and their effect on the outcome
• Example:
▫ In regression, coefficients enable us to interpret the influence of an independent variable on the dependent variable.
▫ The standard errors of the coefficient estimates enable us to determine how confident we are in these estimates.
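
As a sketch of the regression example, the slope of a one-predictor regression and its standard error can be computed as follows. It uses the textbook formula for simple linear regression; the function name and data are illustrative.

```python
import math

def slope_with_standard_error(x, y):
    """Return (slope, standard error of the slope) for simple linear regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual variance with n - 2 degrees of freedom (two fitted coefficients).
    sigma2 = sum(r * r for r in residuals) / (n - 2)
    return slope, math.sqrt(sigma2 / sxx)

# Illustrative data lying close to y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, se_slope = slope_with_standard_error(x, y)
t_stat = slope / se_slope  # large |t| suggests the slope is well determined
```

The slope reads as "y changes by about this much per unit of x", and a small standard error relative to the slope indicates that the estimate is stable.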

38
• Parsimonious models: A parsimonious model is a model that
accomplishes a desired level of explanation or prediction with as
few predictor variables as possible.
• Example:
▫ In regression, using Exhaustive search, Forward search, Backward
search or Stepwise regression in model selection
▫ Using PCA on the feature space prior to model building

39
• Ensemble models: Ensemble methods use multiple learning
algorithms to obtain better predictive performance than could be
obtained from any of the constituent learning algorithms alone.
Image from: https://blogs.sas.com/content/subconsciousmusings/2017/05/18/stacked-ensemble-models-win-data-science-competitions/
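
A minimal sketch of the ensemble idea, using majority voting, which is one of the simplest combination schemes (it is not the stacked ensemble shown in the referenced image). The three sets of model predictions are hypothetical and constructed by hand so that the models err on different observations.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

def accuracy(actual, predicted):
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

actual  = [1, 0, 1, 1, 0, 1, 0, 0]
model_a = [1, 0, 1, 1, 0, 0, 1, 0]  # wrong on observations 5 and 6
model_b = [1, 1, 1, 0, 0, 1, 0, 0]  # wrong on observations 1 and 3
model_c = [0, 0, 1, 1, 1, 1, 0, 0]  # wrong on observations 0 and 4

ensemble = majority_vote([model_a, model_b, model_c])
individual = [accuracy(actual, m) for m in (model_a, model_b, model_c)]
combined = accuracy(actual, ensemble)
```

Each model is 75% accurate on its own, yet the vote is correct everywhere because no two models make the same mistake. When model errors are correlated, the gain is smaller.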

40
Frameworks
Full pipeline Automation
• AutoWEKA is an approach for the simultaneous selection of a machine learning algorithm and its hyperparameters; combined with the WEKA package it automatically yields good models for a wide variety of data sets.
• Auto-sklearn is an extension of AutoWEKA using the Python library scikit-learn; it is a drop-in replacement for regular scikit-learn classifiers and regressors. It improves over AutoWEKA by using meta-learning to increase search efficiency and post-hoc ensemble building to combine the models generated during the hyperparameter optimization process.
• TPOT is a data-science assistant which optimizes machine learning pipelines using genetic programming.
Ref: https://www.ml4aad.org/automl/

41
Frameworks
Hyper-parameter optimization and Model Selection
• H2O AutoML provides automated model selection and ensembling for the H2O machine learning and data analytics platform.
• mlr is an R package that contains several hyperparameter optimization techniques for machine learning problems.
Ref: https://www.ml4aad.org/automl/

42
Frameworks
Deep Neural Network Architecture search
• Google Cloud AutoML is a cloud-based machine learning service which so far provides the automated generation of computer vision pipelines.
• Auto Keras is an open-source Python package for neural architecture search.
• Ref:
▫ https://www.ml4aad.org/automl/
▫ https://en.wikipedia.org/wiki/Automated_machine_learning

44
Hardware Considerations
Reference: https://azure.microsoft.com/en-us/blog/release-models-at-pace-using-microsoft-s-automl/

Sri Krishnamurthy, CFA, CAP
Founder and Chief Data Scientist
QuantUniversity LLC.
srikrishnamurthy
www.QuantUniversity.com
www.analyticscertificate.com
www.qusandbox.com
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be
distributed or used in any other publication without the prior written consent of QuantUniversity LLC.
46
