Data Science Projects at BBVA
Demystifying Big Data / 2
Diego J. Bodas Sagi
Data Scientist at BBVA Data & Analytics
PhD in AI
MBA
PMP
MSc in Mathematics
@DiegoBodasSagi
diegobodas@yahoo.es
Big Data Analytics at BBVA
BBVA Data & Analytics
The Analytics Center of Excellence of BBVA (wholly owned subsidiary)
Goal: to globally drive BBVA's transformation into a digital, data-driven business
45 people from 10 countries, 33% women, 16 PhDs
Madrid - Barcelona - México D.F.
Data Science Projects at BBVA / 4
Syllabus
01 A Machine Learning perspective
02 The Practice
03 The Production
04 The Applications
05 The Implications
01 A Machine Learning perspective
Art by humans?
Why do we talk about Machine Learning today?
“The aim of art is to represent not the outward appearance of things, but their inward significance” (Aristotle)
How to deliver value?
Defining objectives
DESIRABLE: NEEDS AND PROBLEMS TO BE SOLVED
PROFITABLE: VALUE PERCEIVED BY CUSTOMERS AND COMPETITIVE ADVANTAGES
POSSIBLE: TECHNICAL FEASIBILITY, CAPABILITIES, BUDGET...
What are we working on?
Above the glass (income)
1. Consumer financial management advice
2. Retailer management advice
3. Offering the best products to our customers
4. Helping public administration: mobility, tourism, public policies, etc.
Above the glass (efficiencies)
1. Understanding the economic environment
2. Avoiding fraud
3. Better risk management
4. Improving processes
5. Agile development
Bad vs good questions
• What can be done with this data?
• Is this a relevant business problem?
• Where can I find useful data to help me solve this problem?
Where is value created?
DATA TALENT
Pain point
● Finding REAL data scientists
DATA TALENT
Pain point
● DATA
○ Enough data?
○ Right data?
○ Timely data?
○ ...
DATA TALENT
Data governance is paramount
The myths
● Machine Learning can be “self-sufficient”. Machine learning is a co-pilot, not an autopilot. A person is needed to make judgment calls on the machine's output
● The more data the better… It depends! Take into account data quality and imbalanced datasets
● AI is replacing humans. No, AI is “augmenting” humans
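The remark about imbalanced datasets can be illustrated with a minimal sketch. The data below is hypothetical (990 legitimate and 10 fraudulent transactions, not BBVA figures), and the "model" is just an assumed baseline that always predicts the majority class. It shows why accuracy alone can look excellent while the minority class is missed entirely.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of true positive cases that the model actually found."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)


# Hypothetical, heavily imbalanced labels: 0 = legitimate, 1 = fraud.
y_true = [0] * 990 + [1] * 10

# Assumed baseline "model": always predict the majority class.
y_pred = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)  # high, because most labels are 0
baseline_recall = recall(y_true, y_pred)      # no fraud case is detected

print(f"accuracy={baseline_accuracy:.2f} recall={baseline_recall:.2f}")
```

More rows of the majority class would not change this picture, which is one reason the volume of data matters less than its quality and balance.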
Co-pilot…
Be careful with this chatbot
Ref: http://www.ticbeat.com/cyborgcultura/el-chatbot-de-microsoft-que-se-volvio-nazi/
02 The Practice
Simply applying Machine Learning algorithms to your data won’t work
Helping frameworks: design thinking
Helping frameworks: agile teams
Where Design Thinking meets Data Science
Process diagram (labels): Start with a question, challenge, opportunity; Form the hypothesis; Prototype; Iterate; Explore solutions to similar problems; Evaluate; Design the dataset; Model; Production; Validate; Document; Visualize; Evaluate; Experience; Data; Data engine; Iterate
Articulate the key questions
Build a tangible vision of the solution with priorities, goals and scope
Iterate and discover
Understand the limitations of the algorithm, user testing
Share the insights from quantitative exploration
Continuous improvement
Evaluate the impact on the experience
Reformulate the objectives
03 The Production
The Technology
Storage
Programming
Stability vs Speed of Innovation
Stability: all systems are working all the time
Speed of innovation: all components are changing all the time
The Deployment
Prototype
Deployment
Monitoring
Improving
A clear path to production is required
The cost: machine learning is not free
• Complex models and code (glue code)
• Data dependencies
• Dealing with changes in the external world
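One way to watch for changes in the external world is to compare the data a model sees in production against the data it was trained on. The sketch below uses the Population Stability Index (PSI), a common drift measure, as an illustration. The slides do not say which technique BBVA uses, so the choice of PSI, the function name `psi`, the bin edges and the sample values are all assumptions.

```python
import math


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples of one feature.

    `edges` are the inner bin boundaries; `eps` avoids log(0) for empty bins.
    """
    def bin_shares(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of boundaries the value exceeds.
            counts[sum(v > e for e in edges)] += 1
        return [max(c / len(values), eps) for c in counts]

    exp_shares = bin_shares(expected)
    act_shares = bin_shares(actual)
    return sum(
        (a - e) * math.log(a / e) for e, a in zip(exp_shares, act_shares)
    )


# Hypothetical feature values seen at training time.
train = [10, 20, 30, 40, 50, 60, 70, 80]
# Hypothetical production values after the external world has shifted.
shifted = [v + 40 for v in train]
edges = [25, 50, 75]  # assumed bin boundaries

stable_score = psi(train, train, edges)   # identical distributions
drift_score = psi(train, shifted, edges)  # shifted distribution

print(f"stable={stable_score:.3f} drift={drift_score:.3f}")
```

A score near zero suggests the input distribution is unchanged, while a growing score is a signal to revisit the model. Any alert threshold would have to be chosen per use case.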
04 The Applications
Deep Learning
Deep Learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain called artificial neural networks
Diagram labels: ML, DL, Evolution
Discussion
Ref: The Mythos of Model Interpretability by Zachary C. Lipton
https://arxiv.org/pdf/1606.03490.pdf
Always keep in mind
05 The Implications
General Lessons
• Get to know the problem domain
• Do not be afraid to start from scratch if your assumptions are wrong
• Monitor quality continuously
• Beware of crowdsourcing
Other key points
• Infrastructure (cost structure & scalability)
• Learning curves change constantly and frequently
• A data science team has to be learning almost constantly
• Pay attention to motivation within the team
  • Autonomy
  • Competence
  • Relatedness
• Bureaucracy, security, legal, norms... (work as one team)
The Near Futures
Standards boost business
AI: Narrow vs General
Narrow AI
- Driven by industry
- One task
- Practical
General AI
- Driven by scientists
- Multiple tasks
- Understanding
The challenges
The Trust Challenge

BDAS-2017 | Lesson learned from the application of data science at BBVA
