Adopting Data Science and Machine Learning in
the Enterprise
2018 Copyright QuantUniversity LLC.
Presented By:
Sri Krishnamurthy, CFA, CAP
sri@quantuniversity.com
www.analyticscertificate.com
2
About us:
• Data Science, Quant Finance and
Machine Learning Startup
• Technologies using MATLAB, Python
and R
• Programs
▫ Analytics Certificate Program
▫ Fintech programs
• Platform
• Founder of QuantUniversity LLC. and
www.analyticscertificate.com
• Advisory and Consultancy for Financial Analytics
• Prior Experience at MathWorks, Citigroup and
Endeca and 25+ financial services and energy
customers.
• Regular Columnist for the Wilmott Magazine
• Author of forthcoming book
“Financial Modeling: A case study approach”
published by Wiley
• Chartered Financial Analyst and Certified Analytics
Professional
• Teaches Analytics in the Babson College MBA
program and at Northeastern University, Boston
Sri Krishnamurthy
Founder and CEO
3
4
https://quantuniversitycrashcourse.splashthat.com
Boston Fintech Week
5
AI and ML in Finance
6
Sentiment drives markets
7
How did we get here?
8
9
• “AI is the theory and development of computer systems able to
perform tasks that traditionally have required human intelligence.
• AI is a broad field, of which ‘machine learning’ is a sub-category”
What is Machine Learning and AI?
Source: http://www.fsb.org/wp-content/uploads/P011117.pdf
10
Machine Learning & AI in finance – A paradigm shift
• Quant: Stochastic Models, Factor Models, Optimization, Risk Factors, P/Q Quants, Derivative pricing, Trading Strategies, Simulations, Distribution fitting
• Data Scientist: Real-time analytics, Predictive analytics, Machine Learning, RPA, NLP, Deep Learning, Computer Vision, Graph Analytics, Chatbots, Sentiment Analysis, Alternative Data
11
The Virtuous Circle of Machine Learning and AI
• Smart Algorithms
• Hardware
• Data
12
The rise of Big Data and Data Science
Image Source: http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-Vs-of-big-data.jpg
13
Smarter Algorithms
Parallel and Distributed Computing Frameworks
Deep Learning Frameworks
1. Our labeled datasets were thousands of times too
small.
2. Our computers were millions of times too slow.
3. We initialized the weights in a stupid way.
4. We used the wrong type of non-linearity.
- Geoff Hinton
“Capital One was able to determine fraudulent credit
card applications in 100 milliseconds”*
* http://go.databricks.com/hubfs/pdfs/Databricks-for-FinTech-170306.pdf
14
Hardware
15
A framework for evaluating your organization’s appetite for AI
and machine learning
Source: http://www.fsb.org/wp-content/uploads/P011117.pdf
16
17
Handling Data
• Cross-sectional data: Numerical, Categorical
• Longitudinal data: Numerical
18
Goal
• Descriptive Statistics
▫ Cross-sectional: Numerical; Categorical; Numerical vs Categorical; Categorical vs Categorical; Numerical vs Numerical
▫ Time series
• Predictive Analytics
▫ Cross-sectional: Segmentation; Prediction (predict a number, predict a category)
▫ Time-series
19
Machine Learning Algorithms
• Supervised
▫ Prediction, parametric: Linear Regression, Neural Networks
▫ Prediction, non-parametric: KNN, Decision Trees
▫ Classification, parametric: Logistic Regression, Neural Networks
▫ Classification, non-parametric: Decision Trees, KNN
• Unsupervised algorithms
▫ K-means
▫ Associative rule mining
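To make the non-parametric branch of the taxonomy concrete, here is a minimal sketch of a KNN classifier in plain Python. It is an illustration only, not code from the talk: the toy points, the "low"/"high" labels and the function name `knn_predict` are assumptions made for this example. There are no fitted coefficients, so the training data itself acts as the model.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature observations forming two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (7, 7)))
```

A parametric method such as logistic regression would instead summarize the same data in a fixed set of coefficients.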
20
The Process
1. Data cleansing
2. Feature engineering
3. Training and testing
4. Model building
5. Model selection
21
Evaluation framework
Evaluating machine learning algorithms
• Supervised - Prediction: R-square, RMS, MAE, MAPE
• Supervised - Classification: Confusion Matrix, ROC Curves
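The metrics named on this slide can be computed in a few lines. The sketch below is illustrative and assumes that "RMS" means root-mean-square error; the function names and the sample numbers are made up for the example. ROC curves are left out because they need predicted scores rather than hard labels.

```python
import math


def regression_metrics(actual, predicted):
    """Return R-squared, RMSE, MAE and MAPE (in percent) for paired values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE is undefined when an actual value is zero.
        "mape": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


# Hypothetical values, for illustration only.
reg = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
cm = confusion_matrix([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
print(reg)
print(cm)
```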
22
23
24
Claim:
• Machine learning is better for fraud
detection, looking for arbitrage
opportunities and trade execution
Caution:
• Beware of imbalanced class problems
• A model that gives 99% accuracy may still
not be good enough
1. Machine learning is not a generic solution to all problems
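A small worked example shows why 99% accuracy can be misleading on imbalanced classes. The figures are hypothetical (1,000 transactions, 10 of them fraudulent), and the "model" simply never flags fraud.

```python
# Hypothetical data: 1,000 transactions, 10 of them fraudulent (label 1).
y_true = [1] * 10 + [0] * 990

# A trivial "model" that predicts "not fraud" for every transaction.
y_pred = [0] * 1000


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that the model caught."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.2%}, fraud recall = {rec:.2%}")
```

The accuracy comes out at 99% while not a single fraudulent transaction is caught, which is why recall, precision or the confusion matrix are usually reported alongside accuracy for problems like fraud detection.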
25
Claim:
• Our models work on
datasets we have tested on
Caution:
• Do we have enough data?
• How do we handle bias in
datasets?
• Beware of overfitting
• Historical Analysis is not
Prediction
2. A prototype model is not your production model
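The gap between "works on the data we tested on" and real prediction can be sketched with a deliberately overfit model evaluated on a chronological hold-out. Everything here is an illustrative assumption: the trending series, the `MemorizingModel` class and the 80/20 split.

```python
def chronological_split(observations, train_fraction=0.8):
    """Split time-ordered observations into train and test sets without shuffling."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(observations) * train_fraction)
    return observations[:cut], observations[cut:]


class MemorizingModel:
    """Deliberately overfit: stores every training pair, else returns the training mean."""

    def fit(self, pairs):
        self.lookup = dict(pairs)
        self.fallback = sum(y for _, y in pairs) / len(pairs)
        return self

    def predict(self, x):
        return self.lookup.get(x, self.fallback)


def mean_absolute_error(model, pairs):
    return sum(abs(y - model.predict(x)) for x, y in pairs) / len(pairs)


# Hypothetical trending series: (day index, value).
series = [(t, 100 + 2 * t) for t in range(10)]
train, test = chronological_split(series)
model = MemorizingModel().fit(train)

in_sample = mean_absolute_error(model, train)
out_of_sample = mean_absolute_error(model, test)
print(f"in-sample MAE = {in_sample}, out-of-sample MAE = {out_of_sample}")
```

The model is perfect on history and clearly worse on the later, unseen observations. Splitting by time rather than at random keeps future information out of the training set.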
26
AI and Machine Learning in Production
https://www.itnews.com.au/news/hsbc-societe-generale-run-
into-ais-production-problems-477966
Kristy Roth from HSBC:
“It’s been somewhat easy - in a funny way - to
get going using sample data, [but] then you hit
the real problems,” Roth said.
“I think our early track record on PoCs or pilots
hides a little bit the underlying issues.”
Matt Davey from Societe Generale:
“We’ve done quite a bit of work with RPA
recently and I have to say we’ve been a bit
disillusioned with that experience,”
“the PoC is the easy bit: it’s how you get that
into production and shift the balance”
27
Claim:
• It works. We don’t know how!
Caution:
• It’s still not a proven science
• Interpretability or “auditability” of
models is important
• Transparency in the codebase is paramount
with the proliferation of open-source
tools
• Skilled data scientists who are
knowledgeable about algorithms and
their appropriate usage are key to
successful adoption
3. We are just getting started!
28
Claim:
• Machine Learning models are
more accurate than
traditional models
Caution:
• Is accuracy the right metric?
• How do we evaluate the
model? RMS or R²?
• How does the model behave
in different regimes?
4. Choose the right metrics for evaluation
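One way to check how a model behaves in different regimes is to report the error per regime as well as overall. The sketch below assumes each observation already carries a regime label; the labels "calm" and "stressed", the numbers and the function name are hypothetical.

```python
import math
from collections import defaultdict


def rmse(actual, predicted):
    """Root-mean-square error over paired values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def rmse_by_regime(actual, predicted, regimes):
    """Group squared errors by regime label and return one RMSE per regime."""
    squared = defaultdict(list)
    for a, p, r in zip(actual, predicted, regimes):
        squared[r].append((a - p) ** 2)
    return {r: math.sqrt(sum(v) / len(v)) for r, v in squared.items()}


# Hypothetical returns (in percent) with a regime label per observation.
actual = [1.0, 1.2, 0.9, 1.1, -3.0, 4.0]
predicted = [1.1, 1.1, 1.0, 1.0, 0.5, 0.5]
regimes = ["calm"] * 4 + ["stressed"] * 2

overall = rmse(actual, predicted)
per_regime = rmse_by_regime(actual, predicted, regimes)
print(f"overall RMSE = {overall:.3f}")
print({name: round(value, 3) for name, value in per_regime.items()})
```

A single headline number hides that nearly all of the error comes from the stressed observations.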
29
Claim:
• Machine Learning and AI will replace
humans in most applications
Caution:
• Beware of the hype!
• Just because it worked sometimes
doesn’t mean that the organization can
be on autopilot
• Will we have true AI or Augmented
Intelligence?
• Model risk and robust risk
management are paramount to the
success of the organization.
• We are just getting started!
5. Are we there yet?
https://www.bloomberg.com/news/articles/2017-10-20/automation-
starts-to-sweep-wall-street-with-tons-of-glitches
30
31
• Understanding sentiments in Earnings call transcripts
Goal
32
• Interpreting emotions
• Labeling data
Challenges
33
What is NLP?
• AI
• Linguistics
• Computer Science
34
• Q/A
• Dialog systems - Chatbots
• Topic summarization
• Sentiment analysis
• Classification
• Keyword extraction - Search
• Information extraction – Prices, Dates, People etc.
• Tone Analysis
• Machine Translation
• Document comparison – Similar/Dissimilar
Sample applications
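As a taste of the simplest of these applications, information extraction of prices and dates can start with regular expressions. This is a rough sketch, not a production extractor: the patterns only cover dollar amounts and ISO-style dates, and the sample sentence is invented.

```python
import re

# Dollar amounts such as $1,200 or $1,234.50.
PRICE = re.compile(r"\$\d[\d,]*(?:\.\d+)?")
# Dates written as YYYY-MM-DD only; other formats would need more patterns.
DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract(text):
    """Return all price and date strings found in the text."""
    return {"prices": PRICE.findall(text), "dates": DATE.findall(text)}


# Hypothetical sentence, for illustration only.
sample = "On 2018-03-15 the stock closed at $1,234.50, up from $1,200 on 2018-03-14."
found = extract(sample)
print(found)
```

Extracting people or company names is harder and usually calls for a trained named-entity model rather than patterns.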
35
NLP in Finance
36
• If computers can understand language, it opens huge possibilities
▫ Read and summarize
▫ Translate
▫ Describe what’s happening
▫ Understand commands
▫ Answer questions
▫ Respond in plain language
Language allows understanding
37
• Describe rules of grammar
• Describe meanings of words and their
relationships
• …including all the special cases
• ...and idioms
• ...and special cases for the idioms
• ...
• ...understand language!
Traditional language AI
https://en.wikipedia.org/wiki/Formal_language
38
What is NLP?
Jumping NLP Curves
https://ieeexplore.ieee.org/document/6786458/
39
Q: What’s hard about writing programs
to understand text?
40
• Ambiguity:
▫ “ground”
▫ “jaguar”
▫ “The car hit the pole while it was moving”
▫ “One morning I shot an elephant in my pajamas. How he got into my
pajamas, I’ll never know.”
▫ “The tank is full of soldiers.”
“The tank is full of nitrogen.”
Language is hard to deal with
41
42
• Many ways to say the same thing
▫ “the same thing can be said in many ways”
▫ “language is versatile”
▫ “The same words can be arranged in many different ways to express
the same idea”
▫ …
Language is hard to deal with
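The three paraphrases on this slide make a convenient test case. A simple word-overlap measure (Jaccard similarity on sets of words, chosen here only for illustration) rates them as quite different even though they express the same idea.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all distinct words across both sentences."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


s1 = "the same thing can be said in many ways"
s2 = "language is versatile"
s3 = "The same words can be arranged in many different ways to express the same idea"

print(f"s1 vs s2: {jaccard(s1, s2):.2f}")
print(f"s1 vs s3: {jaccard(s1, s3):.2f}")
```

The first pair shares no words at all, and the second pair shares fewer than half, so surface matching alone cannot tell that all three say the same thing.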
43
• APIs
• Human Insight
• Expert Knowledge
• Build your own
Options?
44
NLP pipeline
1. Data ingestion from EDGAR
2. Pre-processing
3. Invoking APIs to label data
4. Compare APIs
5. Build a new model for sentiment analysis
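The "label with APIs, then compare APIs" steps of the pipeline can be sketched as follows. No real service is called: `api_a`, `api_b` and `api_c` are hypothetical stand-ins, and the labels are hard-coded as if each service had already scored the same four sentences.

```python
from collections import Counter
from itertools import combinations

# Hypothetical labels returned by three sentiment services for four sentences.
labels = {
    "api_a": ["positive", "negative", "neutral", "positive"],
    "api_b": ["positive", "negative", "positive", "positive"],
    "api_c": ["positive", "neutral", "neutral", "negative"],
}


def agreement(a, b):
    """Share of sentences on which two labelers give the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def majority_label(votes):
    """Return the label chosen by more than half the labelers, else None."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


pairwise = {
    (m, n): agreement(labels[m], labels[n]) for m, n in combinations(labels, 2)
}
consensus = [majority_label(votes) for votes in zip(*labels.values())]

print(pairwise)
print(consensus)
```

Sentences with a clear majority could seed the training set for a new model, while the disputed ones are candidates for human review.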
45
Building a model vs Deploying a model
QuSandbox- The platform for adopting Data
Science and AI in the Enterprise
47
• QuSandbox is an end-to-end, workflow-based system that enables the
creation and deployment of data science workflows within the
enterprise, primarily for ML and AI applications.
• Our environment supports AWS and Google Cloud Platform and
incorporates model and data provenance throughout the life cycle
of model development.
• The solution can also be hosted on-prem to leverage custom
hardware and software integrations.
Executive Summary
48
The reproducibility challenge
49
What’s needed for reproducibility
• Code
• Data
• Environment
• Process
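One lightweight way to pin down the four ingredients (code, data, environment, process) is a manifest recorded with every run. The sketch below is an assumption about how this could be done with the Python standard library. It is not how QuSandbox does it, and the field names are invented.

```python
import hashlib
import json
import platform
import sys


def build_manifest(code: bytes, data: bytes, process_steps):
    """Record fingerprints of the code and data plus the runtime and the steps run."""
    return {
        "code_sha256": hashlib.sha256(code).hexdigest(),
        "data_sha256": hashlib.sha256(data).hexdigest(),
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "process": list(process_steps),
    }


# Hypothetical inputs; in practice these would be read from files.
manifest = build_manifest(
    code=b"print('fit model')",
    data=b"date,close\n2018-01-02,100\n",
    process_steps=["clean", "train", "evaluate"],
)
print(json.dumps(manifest, indent=2))
```

If either hash or the environment differs between two runs, their results are not directly comparable.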
50
Model Management with QuSandbox
1. Prototype
2. Standardize workflow
3. Productionize and share
51
QuSandbox solution suite for ML/AI applications
• Model Analytics Studio
• QuSandbox
• Research hub
52
Reference points
54
• The regulatory sandbox allows businesses to test innovative
products, services, business models and delivery mechanisms in the
real market, with real consumers.
• The sandbox is a supervised space, open to both authorized and
unauthorized firms, that provides firms with:
▫ reduced time-to-market at potentially lower cost
▫ appropriate consumer protection safeguards built in to new products and
services
▫ better access to finance
• https://www.fca.org.uk/firms/regulatory-sandbox
Regulatory Sandboxes
55
Quant/Enterprise use cases
• Create an environment that can support multiple platforms and
programming languages
• Enable remote running of applications
• Ability to try out a GitHub submission or someone else’s code
• Facilitate creation of Docker images to create replicable containers
• Create prototyping environments for Data Science/Quant teams
• Enable Data scientists/Quants to deploy their solutions
• Enable running multiple tasks and jobs
• Enable concurrent running of multiple experiments
• Integrate seamlessly with the cloud to scale up computations
Use cases
56
Fintech use cases
• To demonstrate solutions to enterprises
• Create customized enterprise trials for companies that don’t permit
installation of vendor software prior to procurement
• To manage quick updates
• Enable effective integration and hosting of services (REST APIs)
• To deploy custom services on QuSandbox
Use cases
57
Academic use cases
• Enable creation of course material and exercises that could be
shared
• Enable students and workshop participants to focus on the data
science experiments rather than environment setting
Use cases
58
ResearchHub
59
Research hub - Process
60
ResearchHub – CLI
61
QuSandbox - Experiment
62
Model Management Studio
63
JDF- DSL
64
QuSandbox
65
QuSandbox – Explore
66
Creating replicable environments
Create and manage replicable environments (Code + software + data) in a single portal
67
Creating replicable environments
Create replicable environments (Code + software + data) through an easy point-and-click tool and
publish to Docker Hub or manage internally
Share it with target users
68
User portal
• Run multiple experiments in pre-created environments (Code + software + data)
• Deploy your own solutions
• Run any Docker image or GitHub submission on the cloud
69
Run Jupyter notebooks and prototype applications
70
Run RStudio and Shiny applications
71
Run any Docker application
72
Manage tasks and errors
73
User portal
• Dockerize and deploy applications on AWS in just a few steps
74
Deploy applications with ease
75
Open source project
76
www.analyticscertificate.com/NLP
77
www.QuSandbox.com
Sri Krishnamurthy, CFA, CAP
Founder and Chief Data Scientist
QuantUniversity LLC.
srikrishnamurthy
www.QuantUniversity.com
www.analyticscertificate.com
www.qusandbox.com
Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be
distributed or used in any other publication without the prior written consent of QuantUniversity LLC.
78

More Related Content

What's hot

Nlp workshop-share
Nlp workshop-shareNlp workshop-share
Nlp workshop-share
QuantUniversity
 
No, you don't need to learn python
No, you don't need to learn pythonNo, you don't need to learn python
No, you don't need to learn python
QuantUniversity
 
Machine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsMachine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and Applications
QuantUniversity
 
Ml master class cfa poland
Ml master class   cfa polandMl master class   cfa poland
Ml master class cfa poland
QuantUniversity
 
Blockchain workshop Intro
Blockchain workshop IntroBlockchain workshop Intro
Blockchain workshop Intro
QuantUniversity
 
QuSandbox+NVIDIA Rapids
QuSandbox+NVIDIA RapidsQuSandbox+NVIDIA Rapids
QuSandbox+NVIDIA Rapids
QuantUniversity
 
CFA-NY Workshop - Final slides
CFA-NY Workshop - Final slidesCFA-NY Workshop - Final slides
CFA-NY Workshop - Final slides
QuantUniversity
 
Ai in finance
Ai in financeAi in finance
Ai in finance
QuantUniversity
 
10 Key Considerations for AI/ML Model Governance
10 Key Considerations for AI/ML Model Governance10 Key Considerations for AI/ML Model Governance
10 Key Considerations for AI/ML Model Governance
QuantUniversity
 
achine Learning and Model Risk
achine Learning and Model Riskachine Learning and Model Risk
achine Learning and Model Risk
QuantUniversity
 
Ml master class northeastern university
Ml master class   northeastern universityMl master class   northeastern university
Ml master class northeastern university
QuantUniversity
 
21st century quant
21st century quant21st century quant
21st century quant
QuantUniversity
 
Qu speaker series:Ethical Use of AI in Financial Markets
Qu speaker series:Ethical Use of AI in Financial MarketsQu speaker series:Ethical Use of AI in Financial Markets
Qu speaker series:Ethical Use of AI in Financial Markets
QuantUniversity
 
Ml master class
Ml master classMl master class
Ml master class
QuantUniversity
 
QCon conference 2019
QCon conference 2019QCon conference 2019
QCon conference 2019
QuantUniversity
 
Careers in analytics
Careers in analyticsCareers in analytics
Careers in analytics
QuantUniversity
 
Machine learning for factor investing
Machine learning for factor investingMachine learning for factor investing
Machine learning for factor investing
QuantUniversity
 
Algorithmic auditing 1.0
Algorithmic auditing 1.0Algorithmic auditing 1.0
Algorithmic auditing 1.0
QuantUniversity
 
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
QuantUniversity
 
Python for Data science
Python for Data sciencePython for Data science
Python for Data science
QuantUniversity
 

What's hot (20)

Nlp workshop-share
Nlp workshop-shareNlp workshop-share
Nlp workshop-share
 
No, you don't need to learn python
No, you don't need to learn pythonNo, you don't need to learn python
No, you don't need to learn python
 
Machine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsMachine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and Applications
 
Ml master class cfa poland
Ml master class   cfa polandMl master class   cfa poland
Ml master class cfa poland
 
Blockchain workshop Intro
Blockchain workshop IntroBlockchain workshop Intro
Blockchain workshop Intro
 
QuSandbox+NVIDIA Rapids
QuSandbox+NVIDIA RapidsQuSandbox+NVIDIA Rapids
QuSandbox+NVIDIA Rapids
 
CFA-NY Workshop - Final slides
CFA-NY Workshop - Final slidesCFA-NY Workshop - Final slides
CFA-NY Workshop - Final slides
 
Ai in finance
Ai in financeAi in finance
Ai in finance
 
10 Key Considerations for AI/ML Model Governance
10 Key Considerations for AI/ML Model Governance10 Key Considerations for AI/ML Model Governance
10 Key Considerations for AI/ML Model Governance
 
achine Learning and Model Risk
achine Learning and Model Riskachine Learning and Model Risk
achine Learning and Model Risk
 
Ml master class northeastern university
Ml master class   northeastern universityMl master class   northeastern university
Ml master class northeastern university
 
21st century quant
21st century quant21st century quant
21st century quant
 
Qu speaker series:Ethical Use of AI in Financial Markets
Qu speaker series:Ethical Use of AI in Financial MarketsQu speaker series:Ethical Use of AI in Financial Markets
Qu speaker series:Ethical Use of AI in Financial Markets
 
Ml master class
Ml master classMl master class
Ml master class
 
QCon conference 2019
QCon conference 2019QCon conference 2019
QCon conference 2019
 
Careers in analytics
Careers in analyticsCareers in analytics
Careers in analytics
 
Machine learning for factor investing
Machine learning for factor investingMachine learning for factor investing
Machine learning for factor investing
 
Algorithmic auditing 1.0
Algorithmic auditing 1.0Algorithmic auditing 1.0
Algorithmic auditing 1.0
 
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
Machine Learning: Considerations for Fairly and Transparently Expanding Acces...
 
Python for Data science
Python for Data sciencePython for Data science
Python for Data science
 

Similar to Adopting Data Science and Machine Learning in the financial enterprise

HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
Sri Ambati
 
Ds for finance day 4
Ds for finance day 4Ds for finance day 4
Ds for finance day 4
QuantUniversity
 
Qu for India - QuantUniversity FundRaiser
Qu for India  - QuantUniversity FundRaiserQu for India  - QuantUniversity FundRaiser
Qu for India - QuantUniversity FundRaiser
QuantUniversity
 
Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"
Diego Oppenheimer
 
Regtech in Fintech + QuSandbox Demo
Regtech in Fintech + QuSandbox DemoRegtech in Fintech + QuSandbox Demo
Regtech in Fintech + QuSandbox Demo
QuantUniversity
 
Scaling Training Data for AI Applications
Scaling Training Data for AI ApplicationsScaling Training Data for AI Applications
Scaling Training Data for AI Applications
Applause
 
ML and AI in Finance: Master Class
ML and AI in Finance: Master ClassML and AI in Finance: Master Class
ML and AI in Finance: Master Class
QuantUniversity
 
Machine learning specialist ver#4
Machine learning specialist ver#4Machine learning specialist ver#4
Machine learning specialist ver#4
EPSILON AI INSTITUTE
 
Architecting for Data Science
Architecting for Data ScienceArchitecting for Data Science
Architecting for Data Science
Johann Schleier-Smith
 
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
DianaGray10
 
Algorithmic auditing 1.0
Algorithmic auditing 1.0Algorithmic auditing 1.0
Algorithmic auditing 1.0
QuantUniversity
 
AI for Software Engineering
AI for Software EngineeringAI for Software Engineering
AI for Software Engineering
Miroslaw Staron
 
Career options in Artificial Intelligence : 2020
Career options in Artificial Intelligence : 2020Career options in Artificial Intelligence : 2020
Career options in Artificial Intelligence : 2020
Venkatarangan Thirumalai
 
Automatic machine learning (AutoML) 101
Automatic machine learning (AutoML) 101Automatic machine learning (AutoML) 101
Automatic machine learning (AutoML) 101
QuantUniversity
 
[DevDay2018] High quality mindset in software development - By: Phat Vu, Scru...
[DevDay2018] High quality mindset in software development - By: Phat Vu, Scru...[DevDay2018] High quality mindset in software development - By: Phat Vu, Scru...
[DevDay2018] High quality mindset in software development - By: Phat Vu, Scru...
DevDay.org
 
AI improves software testing by Kari Kakkonen at TQS
AI improves software testing by Kari Kakkonen at TQSAI improves software testing by Kari Kakkonen at TQS
AI improves software testing by Kari Kakkonen at TQS
Kari Kakkonen
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning Automático
Sri Ambati
 
ML master class
ML master classML master class
ML master class
QuantUniversity
 
influence of AI in IS
influence of AI in ISinfluence of AI in IS
influence of AI in IS
ISACA Riyadh
 
Think Big | Enterprise Artificial Intelligence
Think Big | Enterprise Artificial IntelligenceThink Big | Enterprise Artificial Intelligence
Think Big | Enterprise Artificial Intelligence
Data Science Milan
 

Similar to Adopting Data Science and Machine Learning in the financial enterprise (20)

HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
HR Analytics: Using Machine Learning to Predict Employee Turnover - Matt Danc...
 
Ds for finance day 4
Ds for finance day 4Ds for finance day 4
Ds for finance day 4
 
Qu for India - QuantUniversity FundRaiser
Qu for India  - QuantUniversity FundRaiserQu for India  - QuantUniversity FundRaiser
Qu for India - QuantUniversity FundRaiser
 
Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"Algorithm Marketplace and the new "Algorithm Economy"
Algorithm Marketplace and the new "Algorithm Economy"
 
Regtech in Fintech + QuSandbox Demo
Regtech in Fintech + QuSandbox DemoRegtech in Fintech + QuSandbox Demo
Regtech in Fintech + QuSandbox Demo
 
Scaling Training Data for AI Applications
Scaling Training Data for AI ApplicationsScaling Training Data for AI Applications
Scaling Training Data for AI Applications
 
ML and AI in Finance: Master Class
ML and AI in Finance: Master ClassML and AI in Finance: Master Class
ML and AI in Finance: Master Class
 
Machine learning specialist ver#4
Machine learning specialist ver#4Machine learning specialist ver#4
Machine learning specialist ver#4
 
Architecting for Data Science
Architecting for Data ScienceArchitecting for Data Science
Architecting for Data Science
 
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
So Now You’re a UiPath Developer – What’s Next?” What Role do You Play as Dev...
 
Algorithmic auditing 1.0
Algorithmic auditing 1.0Algorithmic auditing 1.0
Algorithmic auditing 1.0
 
AI for Software Engineering
AI for Software EngineeringAI for Software Engineering
AI for Software Engineering
 
Career options in Artificial Intelligence : 2020
Career options in Artificial Intelligence : 2020Career options in Artificial Intelligence : 2020
Career options in Artificial Intelligence : 2020
 
Automatic machine learning (AutoML) 101
Automatic machine learning (AutoML) 101Automatic machine learning (AutoML) 101
Automatic machine learning (AutoML) 101
 
[DevDay2018] High quality mindset in software development - By: Phat Vu, Scru...
[DevDay2018] High quality mindset in software development - By: Phat Vu, Scru...[DevDay2018] High quality mindset in software development - By: Phat Vu, Scru...
[DevDay2018] High quality mindset in software development - By: Phat Vu, Scru...
 
AI improves software testing by Kari Kakkonen at TQS
AI improves software testing by Kari Kakkonen at TQSAI improves software testing by Kari Kakkonen at TQS
AI improves software testing by Kari Kakkonen at TQS
 
Introducción al Machine Learning Automático
Introducción al Machine Learning AutomáticoIntroducción al Machine Learning Automático
Introducción al Machine Learning Automático
 
ML master class
ML master classML master class
ML master class
 
influence of AI in IS
influence of AI in ISinfluence of AI in IS
influence of AI in IS
 
Think Big | Enterprise Artificial Intelligence
Think Big | Enterprise Artificial IntelligenceThink Big | Enterprise Artificial Intelligence
Think Big | Enterprise Artificial Intelligence
 

More from QuantUniversity

EU Artificial Intelligence Act 2024 passed !
EU Artificial Intelligence Act 2024 passed !EU Artificial Intelligence Act 2024 passed !
EU Artificial Intelligence Act 2024 passed !
QuantUniversity
 
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdfManaging-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
QuantUniversity
 
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALSPYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
QuantUniversity
 
Ml master class for CFA Dallas
Ml master class for CFA DallasMl master class for CFA Dallas
Ml master class for CFA Dallas
QuantUniversity
 
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
QuantUniversity
 
Seeing what a gan cannot generate: paper review
Seeing what a gan cannot generate: paper reviewSeeing what a gan cannot generate: paper review
Seeing what a gan cannot generate: paper review
QuantUniversity
 
AI Explainability and Model Risk Management
AI Explainability and Model Risk ManagementAI Explainability and Model Risk Management
AI Explainability and Model Risk Management
QuantUniversity
 
Machine Learning in Finance: 10 Things You Need to Know in 2021
Machine Learning in Finance: 10 Things You Need to Know in 2021Machine Learning in Finance: 10 Things You Need to Know in 2021
Machine Learning in Finance: 10 Things You Need to Know in 2021
QuantUniversity
 
Bayesian Portfolio Allocation
Bayesian Portfolio AllocationBayesian Portfolio Allocation
Bayesian Portfolio Allocation
QuantUniversity
 
The API Jungle
The API JungleThe API Jungle
The API Jungle
QuantUniversity
 
Explainable AI Workshop
Explainable AI WorkshopExplainable AI Workshop
Explainable AI Workshop
QuantUniversity
 
Constructing Private Asset Benchmarks
Constructing Private Asset BenchmarksConstructing Private Asset Benchmarks
Constructing Private Asset Benchmarks
QuantUniversity
 
Machine Learning Interpretability
Machine Learning InterpretabilityMachine Learning Interpretability
Machine Learning Interpretability
QuantUniversity
 
Responsible AI in Action
Responsible AI in ActionResponsible AI in Action
Responsible AI in Action
QuantUniversity
 
Qu speaker series 14: Synthetic Data Generation in Finance
Qu speaker series 14: Synthetic Data Generation in FinanceQu speaker series 14: Synthetic Data Generation in Finance
Qu speaker series 14: Synthetic Data Generation in Finance
QuantUniversity
 
Qwafafew meeting 5
Qwafafew meeting 5Qwafafew meeting 5
Qwafafew meeting 5
QuantUniversity
 
Fintech in the Post-Covid Age
Fintech in the Post-Covid AgeFintech in the Post-Covid Age
Fintech in the Post-Covid Age
QuantUniversity
 
Master Class: GANS with Applications in Synthetic Data Generation
Master Class:   GANS with  Applications in  Synthetic Data GenerationMaster Class:   GANS with  Applications in  Synthetic Data Generation
Master Class: GANS with Applications in Synthetic Data Generation
QuantUniversity
 
Qwafafew meeting 4
Qwafafew meeting 4Qwafafew meeting 4
Qwafafew meeting 4
QuantUniversity
 
Synthetic data in finance
Synthetic data in financeSynthetic data in finance
Synthetic data in finance
QuantUniversity
 

More from QuantUniversity (20)

EU Artificial Intelligence Act 2024 passed !
EU Artificial Intelligence Act 2024 passed !EU Artificial Intelligence Act 2024 passed !
EU Artificial Intelligence Act 2024 passed !
 
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdfManaging-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
Managing-the-Risks-of-LLMs-in-FS-Industry-Roundtable-TruEra-QuantU.pdf
 
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALSPYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
PYTHON AND DATA SCIENCE FOR INVESTMENT PROFESSIONALS
 
Ml master class for CFA Dallas
Ml master class for CFA DallasMl master class for CFA Dallas
Ml master class for CFA Dallas
 
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
Towards Fairer Datasets: Filtering and Balancing the Distribution of the Peop...
 
Seeing what a gan cannot generate: paper review
Seeing what a gan cannot generate: paper reviewSeeing what a gan cannot generate: paper review
Seeing what a gan cannot generate: paper review
 
AI Explainability and Model Risk Management
AI Explainability and Model Risk ManagementAI Explainability and Model Risk Management
AI Explainability and Model Risk Management
 
Machine Learning in Finance: 10 Things You Need to Know in 2021
Machine Learning in Finance: 10 Things You Need to Know in 2021Machine Learning in Finance: 10 Things You Need to Know in 2021
Machine Learning in Finance: 10 Things You Need to Know in 2021
 
Bayesian Portfolio Allocation
Bayesian Portfolio AllocationBayesian Portfolio Allocation
Bayesian Portfolio Allocation
 
The API Jungle
The API JungleThe API Jungle
The API Jungle
 
Explainable AI Workshop
Explainable AI WorkshopExplainable AI Workshop
Explainable AI Workshop
 
Constructing Private Asset Benchmarks
Constructing Private Asset BenchmarksConstructing Private Asset Benchmarks
Constructing Private Asset Benchmarks
 
Machine Learning Interpretability
Machine Learning InterpretabilityMachine Learning Interpretability
Machine Learning Interpretability
 
Responsible AI in Action
Responsible AI in ActionResponsible AI in Action
Responsible AI in Action
 
Qu speaker series 14: Synthetic Data Generation in Finance
Qu speaker series 14: Synthetic Data Generation in FinanceQu speaker series 14: Synthetic Data Generation in Finance
Qu speaker series 14: Synthetic Data Generation in Finance
 
Qwafafew meeting 5
Qwafafew meeting 5Qwafafew meeting 5
Qwafafew meeting 5
 
Fintech in the Post-Covid Age
Fintech in the Post-Covid AgeFintech in the Post-Covid Age
Fintech in the Post-Covid Age
 
Master Class: GANS with Applications in Synthetic Data Generation
Master Class:   GANS with  Applications in  Synthetic Data GenerationMaster Class:   GANS with  Applications in  Synthetic Data Generation
Master Class: GANS with Applications in Synthetic Data Generation
 
Qwafafew meeting 4
Qwafafew meeting 4Qwafafew meeting 4
Qwafafew meeting 4
 
Synthetic data in finance
Synthetic data in financeSynthetic data in finance
Synthetic data in finance
 

Adopting Data Science and Machine Learning in the financial enterprise

  • 1. Adopting Data Science and Machine Learning in the Enterprise 2018 Copyright QuantUniversity LLC. Presented By: Sri Krishnamurthy, CFA, CAP sri@quantuniversity.com www.analyticscertificate.com
  • 2. About us: • Data Science, Quant Finance and Machine Learning Startup • Technologies using MATLAB, Python and R • Programs ▫ Analytics Certificate Program ▫ Fintech programs • Platform
  • 3. Sri Krishnamurthy, Founder and CEO • Founder of QuantUniversity LLC. and www.analyticscertificate.com • Advisory and Consultancy for Financial Analytics • Prior experience at MathWorks, Citigroup and Endeca, and with 25+ financial services and energy customers • Regular columnist for the Wilmott Magazine • Author of the forthcoming book “Financial Modeling: A case study approach”, published by Wiley • Chartered Financial Analyst and Certified Analytics Professional • Teaches Analytics in the Babson College MBA program and at Northeastern University, Boston
  • 5. AI and ML in Finance
  • 7. How did we get here?
  • 9. What is Machine Learning and AI? • “AI is the theory and development of computer systems able to perform tasks that traditionally have required human intelligence. • AI is a broad field, of which ‘machine learning’ is a sub-category” Source: http://www.fsb.org/wp-content/uploads/P011117.pdf
  • 10. Machine Learning & AI in finance: a paradigm shift • Quant: Stochastic Models, Factor Models, Optimization, Risk Factors, P/Q Quants, Derivative pricing, Trading Strategies, Simulations, Distribution fitting • Data Scientist: Real-time analytics, Predictive analytics, Machine Learning, RPA, NLP, Deep Learning, Computer Vision, Graph Analytics, Chatbots, Sentiment Analysis, Alternative Data
  • 11. The Virtuous Circle of Machine Learning and AI: Smart Algorithms, Hardware, Data
  • 12. The rise of Big Data and Data Science Image Source: http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-Vs-of-big-data.jpg
  • 13. Smarter Algorithms • Parallel and Distributed Computing Frameworks • Deep Learning Frameworks • “1. Our labeled datasets were thousands of times too small. 2. Our computers were millions of times too slow. 3. We initialized the weights in a stupid way. 4. We used the wrong type of non-linearity.” (Geoff Hinton) • “Capital One was able to determine fraudulent credit card applications in 100 milliseconds”* * http://go.databricks.com/hubfs/pdfs/Databricks-for-FinTech-170306.pdf
  • 15. A framework for evaluating your organization’s appetite for AI and machine learning Source: http://www.fsb.org/wp-content/uploads/P011117.pdf
  • 18. Goal • Descriptive Statistics ▫ Cross-sectional: Numerical, Categorical, Numerical vs Categorical, Categorical vs Categorical, Numerical vs Numerical ▫ Time series • Predictive Analytics ▫ Cross-sectional: Segmentation, Prediction (predict a number, predict a category) ▫ Time series
  • 19. Machine Learning Algorithms • Supervised, Prediction ▫ Parametric: Linear Regression, Neural Networks ▫ Non-parametric: KNN, Decision Trees • Supervised, Classification ▫ Parametric: Logistic Regression, Neural Networks ▫ Non-parametric: Decision Trees, KNN • Unsupervised algorithms: K-means, Association rule mining
  • 21. Evaluating Machine Learning algorithms: evaluation framework • Supervised, Prediction: R-squared, RMS error, MAE, MAPE • Supervised, Classification: Confusion Matrix, ROC Curves
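The metrics named on this slide can be computed directly from paired actual and predicted values. The following is a minimal pure-Python sketch, not the presenter's code; the function names are illustrative assumptions, and "RMS" is read here as root-mean-square error.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R-squared, RMSE, MAE and MAPE for paired actual/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE is undefined when an actual value is zero.
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


def confusion_matrix(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if p == 1:
            counts["tp" if t == 1 else "fp"] += 1
        else:
            counts["tn" if t == 0 else "fn"] += 1
    return counts


def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per score threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        preds = [1 if s >= threshold else 0 for s in scores]
        cm = confusion_matrix(y_true, preds)
        points.append((cm["fp"] / negatives, cm["tp"] / positives))
    return points
```

Plotting the pairs returned by `roc_points` gives the ROC curve; in practice a library such as scikit-learn would typically be used instead of hand-written metrics.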
  • 24. 1. Machine learning is not a generic solution to all problems Claim: • Machine learning is better for fraud detection, looking for arbitrage opportunities and trade execution Caution: • Beware of imbalanced class problems • A model that gives 99% accuracy may still not be good enough
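The caution about 99% accuracy can be made concrete with a small sketch. The data below is hypothetical (1,000 transactions, 10 of them fraudulent), and the "model" is a deliberately useless one that never flags fraud.

```python
def accuracy(y_true, y_pred):
    """Share of all predictions that match the actual label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Share of actual positives (fraud, label 1) that the model flagged."""
    flagged = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(flagged) / len(flagged)


# Hypothetical data: 1,000 transactions, 10 of them fraudulent (label 1).
y_true = [1] * 10 + [0] * 990
# A "model" that labels every transaction as legitimate.
always_legit = [0] * 1000

print(accuracy(y_true, always_legit))  # 0.99
print(recall(y_true, always_legit))    # 0.0
```

The classifier is 99% accurate yet catches no fraud at all, which is why recall, precision, or the confusion matrix are usually more informative than accuracy on imbalanced classes.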
  • 25. 2. A prototype model is not your production model Claim: • Our models work on datasets we have tested on Caution: • Do we have enough data? • How do we handle bias in datasets? • Beware of overfitting • Historical analysis is not prediction
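One way to see the overfitting caution is to compare error on the data a model was fitted to with error on data it has not seen. The sketch below uses a one-nearest-neighbor predictor, which simply memorizes its training points; the numbers are hypothetical noisy observations of y = 2x, chosen only for illustration.

```python
def one_nn_predict(train, x):
    """Predict by copying the target of the closest training point (pure memorization)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def mean_abs_error(train, eval_data):
    """Mean absolute error of the 1-NN predictor fitted on `train`, scored on `eval_data`."""
    return sum(abs(one_nn_predict(train, x) - y) for x, y in eval_data) / len(eval_data)


# Hypothetical noisy observations of y = 2x.
train = [(1, 2.3), (2, 3.6), (3, 6.4), (4, 7.7)]
# Held-out points that lie exactly on y = 2x.
holdout = [(1.4, 2.8), (2.6, 5.2), (3.6, 7.2)]

train_error = mean_abs_error(train, train)      # zero: every point is its own nearest neighbor
holdout_error = mean_abs_error(train, holdout)  # positive: memorized noise does not transfer
print(train_error, holdout_error)
```

A perfect score on the data used for fitting says nothing about production behavior; only the held-out error approximates it.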
  • 26. AI and Machine Learning in Production • Kristy Roth from HSBC: “It’s been somewhat easy - in a funny way - to get going using sample data, [but] then you hit the real problems,” Roth said. “I think our early track record on PoCs or pilots hides a little bit the underlying issues.” • Matt Davey from Societe Generale: “We’ve done quite a bit of work with RPA recently and I have to say we’ve been a bit disillusioned with that experience,” “the PoC is the easy bit: it’s how you get that into production and shift the balance” Source: https://www.itnews.com.au/news/hsbc-societe-generale-run-into-ais-production-problems-477966
  • 27. 3. We are just getting started! Claim: • It works. We don’t know how! Caution: • It’s still not a proven science • Interpretability or “auditability” of models is important • Transparency in codebase is paramount with the proliferation of open-source tools • Skilled data scientists who are knowledgeable about algorithms and their appropriate usage are key to successful adoption
  • 28. 4. Choose the right metrics for evaluation Claim: • Machine Learning models are more accurate than traditional models Caution: • Is accuracy the right metric? • How do we evaluate the model: RMS error or R²? • How does the model behave in different regimes?
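A single aggregate error can hide poor behavior in one market regime. The sketch below breaks root-mean-square error down by a regime label; the regime names and return figures are hypothetical, and how regimes are identified is outside the scope of this example.

```python
import math
from collections import defaultdict


def rmse_by_regime(records):
    """records: iterable of (regime, actual, predicted). Returns {regime: RMSE}."""
    squared_errors = defaultdict(list)
    for regime, actual, predicted in records:
        squared_errors[regime].append((actual - predicted) ** 2)
    return {r: math.sqrt(sum(v) / len(v)) for r, v in squared_errors.items()}


# Hypothetical daily returns (in percent) with model predictions.
records = [
    ("calm", 1.0, 1.1), ("calm", 0.5, 0.4),
    ("stressed", -3.0, -1.0), ("stressed", -4.0, -1.0),
]
print(rmse_by_regime(records))
```

In this made-up sample the model tracks calm days closely but misses badly on stressed days, a difference that a pooled metric would blur.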
  • 29. 5. Are we there yet? Claim: • Machine Learning and AI will replace humans in most applications Caution: • Beware of the hype! • Just because it worked sometimes doesn’t mean that the organization can be on autopilot • Will we have true AI or Augmented Intelligence? • Attention to model risk and robust risk management is paramount to the success of the organization. • We are just getting started! https://www.bloomberg.com/news/articles/2017-10-20/automation-starts-to-sweep-wall-street-with-tons-of-glitches
  • 31. Goal • Understanding sentiments in earnings call transcripts
  • 32. Challenges • Interpreting emotions • Labeling data
  • 33. What is NLP? AI, Linguistics, Computer Science
  • 34. Sample applications • Q/A • Dialog systems: Chatbots • Topic summarization • Sentiment analysis • Classification • Keyword extraction: Search • Information extraction: Prices, Dates, People etc. • Tone Analysis • Machine Translation • Document comparison: Similar/Dissimilar
  • 36. Language allows understanding • If computers can understand language, it opens huge possibilities ▫ Read and summarize ▫ Translate ▫ Describe what’s happening ▫ Understand commands ▫ Answer questions ▫ Respond in plain language
  • 37. Traditional language AI • Describe rules of grammar • Describe meanings of words and their relationships • …including all the special cases • ...and idioms • ...and special cases for the idioms • ... • ...understand language! https://en.wikipedia.org/wiki/Formal_language
  • 38. What is NLP? Jumping NLP Curves https://ieeexplore.ieee.org/document/6786458/
  • 39. Q: What’s hard about writing programs to understand text?
  • 40. Language is hard to deal with • Ambiguity: ▫ “ground” ▫ “jaguar” ▫ “The car hit the pole while it was moving” ▫ “One morning I shot an elephant in my pajamas. How he got into my pajamas, I’ll never know.” ▫ “The tank is full of soldiers.” “The tank is full of nitrogen.”
  • 42. Language is hard to deal with • Many ways to say the same thing ▫ “the same thing can be said in many ways” ▫ “language is versatile” ▫ “The same words can be arranged in many different ways to express the same idea” ▫ …
  • 43. Options? • APIs • Human Insight • Expert Knowledge • Build your own
  • 44. NLP pipeline • Data ingestion from EDGAR • Pre-processing • Invoking APIs to label data • Compare APIs • Build a new model for sentiment analysis
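The "Compare APIs" step of this pipeline can be sketched as an agreement check across labeling services. Everything below is an assumption for illustration: the service names `api_a`, `api_b`, `api_c` are placeholders, the labels are made up, and no real sentiment API is called.

```python
from collections import Counter
from itertools import combinations

# Hypothetical labels returned by three sentiment services for the same
# five transcript sentences; the service names are placeholders.
api_labels = {
    "api_a": ["pos", "neg", "neu", "pos", "neg"],
    "api_b": ["pos", "neg", "pos", "pos", "neu"],
    "api_c": ["pos", "neu", "neu", "pos", "neg"],
}


def pairwise_agreement(labels):
    """Fraction of sentences on which each pair of services gives the same label."""
    result = {}
    for a, b in combinations(sorted(labels), 2):
        matches = sum(x == y for x, y in zip(labels[a], labels[b]))
        result[(a, b)] = matches / len(labels[a])
    return result


def majority_labels(labels):
    """Per-sentence majority vote; None when no label has a strict majority."""
    voted = []
    for votes in zip(*labels.values()):
        top, count = Counter(votes).most_common(1)[0]
        voted.append(top if count > len(votes) / 2 else None)
    return voted


print(pairwise_agreement(api_labels))
print(majority_labels(api_labels))
```

Sentences where the services disagree are natural candidates for human review before the labels are used to train a new sentiment model.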
  • 45. Building a model vs Deploying a model
  • 46. QuSandbox: the platform for adopting Data Science and AI in the Enterprise 2018 Copyright QuantUniversity LLC.
  • 47. Executive Summary • QuSandbox is an end-to-end, workflow-based system that enables the creation and deployment of data science workflows within the enterprise, primarily for ML and AI applications. • Our environment supports AWS and Google Cloud Platform and incorporates model and data provenance throughout the life cycle of model development. • The solution can also be hosted on-prem to leverage custom hardware and software integrations.
  • 49. What’s needed for reproducibility: Code, Data, Environment, Process
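The four ingredients on this slide can be captured in a single provenance record. The sketch below is one possible approach using only the Python standard library; it is not how QuSandbox works, and `build_manifest` and its fields are hypothetical names.

```python
import hashlib
import json
import platform
import sys


def sha256_bytes(payload: bytes) -> str:
    """Hex digest that changes whenever the underlying bytes change."""
    return hashlib.sha256(payload).hexdigest()


def build_manifest(code: bytes, data: bytes, steps):
    """Record code, data, environment and process in one JSON-serializable dict."""
    return {
        "code_sha256": sha256_bytes(code),
        "data_sha256": sha256_bytes(data),
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "process": list(steps),
    }


manifest = build_manifest(
    code=b"print('train model')",        # placeholder for the script contents
    data=b"date,close\n2018-01-02,100",  # placeholder for the dataset contents
    steps=["ingest", "preprocess", "train", "evaluate"],
)
print(json.dumps(manifest, indent=2))
```

Storing such a record alongside each model run makes it possible to check later whether the same code, data, and environment were used; a container image digest could stand in for the environment fields.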
  • 51. QuSandbox solution suite for ML/AI applications: Model Analytics Studio, QuSandbox, Research hub
  • 53. Regulatory Sandboxes • The regulatory sandbox allows businesses to test innovative products, services, business models and delivery mechanisms in the real market, with real consumers. • The sandbox is a supervised space, open to both authorized and unauthorized firms, that provides firms with: ▫ reduced time-to-market at potentially lower cost ▫ appropriate consumer protection safeguards built in to new products and services ▫ better access to finance • https://www.fca.org.uk/firms/regulatory-sandbox
  • 54. Quant/Enterprise use cases • Create an environment that can support multiple platforms and programming languages • Enable remote running of applications • Try out a GitHub submission or someone else’s code • Facilitate creation of Docker images to create replicable containers • Create prototyping environments for Data Science/Quant teams • Enable Data scientists/Quants to deploy their solutions • Enable running multiple tasks and jobs • Enable concurrent running of multiple experiments • Integrate seamlessly with the cloud to scale up computations
  • 55. Fintech use cases • Demonstrate solutions to enterprises • Create customized enterprise trials for companies that don’t permit installation of vendor software prior to procurement • Manage quick updates • Enable effective integration and hosting of services (REST APIs) • Deploy custom services on QuSandbox
  • 56. Academic use cases • Enable creation of course material and exercises that can be shared • Enable students and workshop participants to focus on the data science experiments rather than on environment setup
  • 58. Research hub - Process
  • 65. Creating replicable environments • Create and manage replicable environments (Code + software + data) in a single portal
  • 66. Creating replicable environments • Create replicable environments (Code + software + data) through an easy point & click tool and publish to Docker Hub or manage internally • Share them with target users
  • 67. User portal • Run multiple experiments in pre-created environments (Code + software + data) • Deploy your own solutions • Run any Docker image or GitHub submission on the cloud
  • 68. Run Jupyter notebooks and prototype applications
  • 69. Run RStudio and Shiny applications
  • 70. Run any Docker application
  • 72. User portal • Dockerize and deploy applications on AWS in just a few steps
  • 77. Sri Krishnamurthy, CFA, CAP Founder and Chief Data Scientist QuantUniversity LLC. srikrishnamurthy www.QuantUniversity.com www.analyticscertificate.com www.qusandbox.com Information, data and drawings embodied in this presentation are strictly a property of QuantUniversity LLC. and shall not be distributed or used in any other publication without the prior written consent of QuantUniversity LLC.