ANDREA GALLEGO
CTO and Principal, Gamma
BCG
The Complexity to “Yes” in Analytics Software and the Possibilities with Docker and Containers
Agenda
• Intro to BCG – Gamma
• How Gamma innovates with Docker
• What’s coming next
• Q&A
BCG has evolved
Strategic and Technical Excellence
Technology Office
Digital Ventures
• Global team focusing on IT implementation and risk management services
• Global team of world-class business consultants
• Global team of world-class data scientists
• Global team of world-class tech experts
• Cost-efficient launch, industrialize and run
• Business relevance
• Change management
• Impact
• Best-in-class advanced data analytics
• On-the-job enablement of client teams
• Tech excellence on all fronts
• Architecture
• Privacy, security, governance
• Leading analytics firm with 3500 FTEs
• Large-scale enablement and building of teams
• 250 FTE worldwide (IT architects)
• Strong execution and implementation capabilities
• Innovation centers and hatches in 9 locations
• ~500 specialists: innovators, operators, entrepreneurs and investors
Copyright © 2017 by The Boston Consulting Group, Inc. All rights reserved.
Data
Scientists
Design and
Engineering
Business
Consultants
Artificial intelligence
Statistics
Machine learning
Consulting industry
Domain experience in sectors and
functions
Expertise in scaling and integrating
analytics into production
Specialized in analytics and productionization
Advanced degrees
4-15 years experience
Academia
Industry
Consulting
Value realization focus
Operationalize analytics
business transformation
Enterprise analytics software
Scalable machine learning
Value driven analytics architecture
and design
BCG Gamma: a network of 500+ advanced
analytics practitioners, consultants and
software developers
Computer Science
Software development
Machine learning engineering
Visualization
Distributed processing
GammaX – Gamma’s newest endeavor
• 30 engineers specializing in analytics software engineering, data engineering, UX design, distributed systems, and machine learning engineering
• Homegrown proprietary software to scale and productionize analytics models and make data scientists more efficient
• Exclusive partnerships with top technology companies in data management, DevOps, cloud technology and machine learning
• SLA-based technical support from our dedicated team of experts
• Enablement and training for your teams on today’s cutting-edge data and analytics technologies
• World-class design team from HBO and Warner Bros.
Gamma delivers end-to-end analytics transformation
Value creation curve labels: Incubation phase, Prototype phase, Inflection phase, Exponential phase

Pressure test one use case
• Set ambition
• Define & evaluate specifications
• Assess data quality & accessibility
Make go / no go decision

Build customized Proof of Concept to validate business case and feasibility
• Backtest on historical data
• Confirm value
• Put first brick of technology in place
Agree on plan to incubate

Launch MVP in market and improve through test and learn
• Run agile sprints to test solution “in-market” and learn how to improve
• Design and test new ways of working
• Run technology in controlled environment
Commit to scale-up

Scale up solution, transform organization, increase value impact
• Run technology and business process at scale
• Analytics resources/governance in place
• Teams trained
• Client capability to own full solution in place
P&L neutral in first 12 months, with exponential growth beyond

Outcomes
• Articulated case for value capture
• Tangible prototype with business case and plan to execute
• MVP with impact assessment and scaling plan
• Full scale solution integrated into environment
• New ways of working instilled in your team
But that is hard with just PowerPoint pages
So we started thinking of ways to deliver high quality ML / AI at scale
Could we make it possible to:
• Run models on any kind of infrastructure without worrying about the setup.
• Access clean, structured data easily and integrate it with models.
• Provide clients access to analytics output on demand.
• Test models against production-grade standards before sending them to clients.
• Deploy lightweight web services to showcase work or model output.
• Make sure data and models are always secure.
• Share model code and outputs with other data scientists and work from a common environment with the exact same dependencies.
• Deliver models at the edge with real-time orchestration, monitoring and model updates.
Docker Journey in building
Gamma’s software
How we made Docker work for us
in the ML/AI @ Scale game
...and we built our own system
• Users can use PyCharm or Jupyter to write code, publish analytics models and explore data.
• Source uses Pachyderm and proprietary code to let data scientists easily explore and manipulate data, all through the Source user interface.
• Source allows data scientists to share production models with clients by running production tests, then building and shipping Docker containers to the client site.
• Users never worry about passwords or credentials. Source and HashiCorp Vault manage these seamlessly behind the scenes for all systems.
• Source’s proprietary core allows any data scientist or client to create this entire system with the push of a button. Source has a code base that will launch and generate this environment on the client site [1].
[1] Source has some dependencies that client systems need to have so that Source will install successfully.
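The "running production tests and building and shipping Docker containers" step described above can be sketched as a simple gate. This is only an illustration under stated assumptions, not Source's actual code: `plan_release`, the two example checks, and the registry and image names are hypothetical. Only `docker build -t` and `docker push` are real Docker CLI commands, and the sketch constructs them without executing anything.

```python
from typing import Callable, List, Sequence


def plan_release(
    image: str,
    tag: str,
    checks: Sequence[Callable[[], bool]],
    context_dir: str = ".",
) -> List[List[str]]:
    """Return the Docker commands to run, or raise if any check fails."""
    # Collect the names of every failing check so the error is actionable.
    failed = [check.__name__ for check in checks if not check()]
    if failed:
        raise RuntimeError("production checks failed: " + ", ".join(failed))
    ref = f"{image}:{tag}"
    return [
        ["docker", "build", "-t", ref, context_dir],
        ["docker", "push", ref],
    ]


# Hypothetical stand-ins for real production tests.
def model_loads() -> bool:
    return True


def predictions_in_range() -> bool:
    return True


commands = plan_release(
    "registry.example.com/case-n/model",  # hypothetical registry and image
    "1.0.0",
    [model_loads, predictions_in_range],
)
for cmd in commands:
    print(" ".join(cmd))
```

Returning the commands instead of running them keeps the gate easy to test; a caller could pass each list to `subprocess.run(cmd, check=True)` once the checks have passed.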
Source Beta architecture
[Architecture diagram. Labels: ML Engine VPC; Front End Cluster; Case N Cluster; JSON + Vault Creds; Case Specific Secure Data; Pipelines; Data Repos; Case Data; Object Store (Encrypted); Data Scientist Specific Notebooks; Private Subnets; Amazon ECS (shown twice)]
At the beginning we worked with “vanilla” Docker
Now we have tuned and optimized Docker Trusted Registry (DTR) for our needs
• Front end services: 19
• CircleCI Enterprise controller and CircleCI services: 44
• Private Terraform Enterprise: 6
• UCP and DTR with replication: 75
• Applications: 51 and growing
Metrics: 200+ BAU containers; 20+ containers with each new user application
Each application takes on 17 or more new containers depending on data size
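As a rough worked example of what these figures imply for capacity planning, the sketch below adds up the counts shown on the slide and projects growth. Treating each figure as a container count is an assumption (the slide does not state units), and the choice of ten new applications and the name `projected_containers` are purely illustrative.

```python
# Counts as listed on the slide; treating them as container counts is an assumption.
baseline = {
    "front_end_services": 19,
    "circleci_enterprise": 44,
    "private_terraform_enterprise": 6,
    "ucp_and_dtr_with_replication": 75,
    "applications": 51,
}


def projected_containers(current_total: int, new_apps: int, per_app: int) -> int:
    """Linear projection: every new application adds a fixed number of containers."""
    return current_total + new_apps * per_app


total = sum(baseline.values())
# The slide quotes both "17 or more" and "20+" containers per new application,
# so project with each figure for a hypothetical ten new applications.
low = projected_containers(total, new_apps=10, per_app=17)
high = projected_containers(total, new_apps=10, per_app=20)
print(total, low, high)
```

Under these assumptions the listed counts sum to 195, close to the quoted "200+ BAU containers", and ten more applications would bring the total to between 365 and 395.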
Docker Journey in
building Source
What’s next for us
Image signing across the entire software supply chain for ourselves and our clients
Production Environments
Docker Trusted Registry - External
Docker UCP
Production Environments
Version Control
Docker UCP
Internal Production Environments
Source Platform Development
Development Client
Client A Datacenter
Client B Datacenter
Docker Trusted Registry - Internal
Docker for
Docker UCP
Production
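One way to approach image signing with stock Docker tooling is Docker Content Trust: when the environment variable `DOCKER_CONTENT_TRUST=1` is set, `docker push` signs the pushed tag and `docker pull` accepts only signed tags. The sketch below is a minimal illustration of that mechanism, not a description of how Source does it. The helper `signed_push_plan` and the registry hostname are hypothetical, and nothing is executed.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple


def signed_push_plan(
    image_ref: str,
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Build a `docker push` command plus an environment with content trust on."""
    # Copy the environment so the caller's mapping is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["DOCKER_CONTENT_TRUST"] = "1"
    return ["docker", "push", image_ref], env


# Hypothetical internal registry and image name; base_env={} keeps the demo output small.
cmd, env = signed_push_plan("dtr.internal.example.com/source/model:1.0.0", base_env={})
print(cmd, env["DOCKER_CONTENT_TRUST"])
```

To actually push, a caller would omit `base_env` so the real environment (including `PATH`) is inherited, then run `subprocess.run(cmd, env=env, check=True)`.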
Analytics deployment at Edge
Secure Client Container Trusted Registry
Monitoring, orchestration, scale
Analytics Package
(managed by Source)
Nodes
Analytics
Package
Nodes
Analytics Package
Manager Application
Node
Edge
Source Edge Deployment Service
Edge Command Center
Source core platform
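The monitoring and model-update role of an edge command center can be illustrated with a small planning function. Everything here is hypothetical (the `EdgeNode` fields, the dotted version scheme, the node names); the slide does not describe how the Source Edge Deployment Service decides which nodes to update, so this is a sketch of the general idea only.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class EdgeNode:
    name: str
    package_version: str  # assumed dotted numeric version, e.g. "1.2.0"
    healthy: bool


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "1.2.0" into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def plan_updates(nodes: Iterable[EdgeNode], target_version: str) -> Dict[str, List[str]]:
    """Sort nodes into those to update, those to skip, and those already current."""
    plan: Dict[str, List[str]] = {"update": [], "skip_unhealthy": [], "current": []}
    target = parse_version(target_version)
    for node in nodes:
        if not node.healthy:
            # Do not push a new package to a node that is failing health checks.
            plan["skip_unhealthy"].append(node.name)
        elif parse_version(node.package_version) < target:
            plan["update"].append(node.name)
        else:
            plan["current"].append(node.name)
    return plan


nodes = [
    EdgeNode("edge-a", "1.2.0", True),
    EdgeNode("edge-b", "1.3.0", True),
    EdgeNode("edge-c", "1.1.5", False),
]
plan = plan_updates(nodes, "1.3.0")
print(plan)
```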
Thank you
