Data Refineries as an Example of
Machine Learning (ML) and AI
Creating Jobs
Robert B. Cohen, PhD
Senior Fellow, Economic Strategy Institute, bcohen@bway.net
CITI Conference on Future Employment in the Digital Economy
September 7, 2018
Software Infrastructure, Data Refineries and
Digital Economy Jobs
• The Digital Economy is using Machine Learning algorithms to analyze a
wide range of economic events.
• Data Refineries demonstrate how AI, besides running machines, will create
jobs:
• 1. They create opportunities for new applications of data and for jobs to manage data.
Descartes Labs’ work on satellite imagery illustrates this.
• The software includes microservices, containers, Kubernetes, Jupyter, Istio, and TensorFlow.
• The tools and web-based support include GitHub, DevOps, and the Continuous Integration
and Continuous Delivery of software.
• The New Infrastructure Jobs will be numerous due to:
• 1. A sizable increase in demand for infrastructure software and tools that
accompanies the move to a digital economy.
• 2. Network effects and elasticity of demand effects, particularly based on the use of
GitHub and Open Source software.
The Hierarchy of Infrastructure Software and Tools
Software Hierarchy: Highest Level Performs the Most Sophisticated Management and Coordination
Columns: Software, Function, Operates Closely with
• Istio (Highest Level of Hierarchy)
Function: "Service Mesh" to coordinate container clusters and ML algorithm performance.
Operates closely with: Machine Learning models, together with, for example, Google Kubernetes Engine. Istio can run on top of Microservices. Can run multi-tenancy workloads.
• Kubernetes
Function: Control for Container clustering and Platform for ML.
Operates closely with: Microservices and Containers.
• Containers
Function: "Packaged" Code for Applications. A "composable Platform."
Operates closely with: Jupyter, software that supports interactive data science and scientific computing across all programming languages.
• Microservices
Function: "Building Blocks" for Applications. Multiple Microservices can be in a single container.
Operates closely with: GitHub.
• Continuous Integration/Continuous Delivery (CI/CD)
Function: The process for writing, testing, and deploying new software.
Operates closely with: GitHub.
• DevOps (Lowest Level of Hierarchy)
Function: Software engineering approach that unifies Development, Deployment and Operation of Applications.
Operates closely with: GitHub.
GitHub: Source of Code to Create Software and Deploy It in Microservices.
Data Refineries build
huge databases for ML
to analyze.
These examples from Descartes Labs
suggest how broad new fields can be
rapidly opened for analysis. This will
require jobs in the following areas:
Data Engineers who gather, clean and
conserve data.
Data Analysts and Data Scientists who
decide how to analyze data and whether
the results answer the questions that
they pose.
Data Wranglers who transform and map
data from a “raw” form into another
format so it can be more valuable
for analytics.
Descartes Labs, “Advancing the science of soy forecasting”, https://medium.com/@DescartesLabs/advancing-the-science-of-soy-forecasting-f399bae42b78
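The Data Wrangler role described above can be illustrated with a minimal sketch. It assumes a hypothetical raw CSV export; the column names and the yield values are invented for illustration and are not taken from Descartes Labs. It shows one way to map "raw" data into a cleaner format for analytics, using only the Python standard library.

```python
import csv
import io

# Hypothetical raw export (made-up values): inconsistent casing,
# stray whitespace, and one row with a missing measurement.
RAW = """Region , Crop,Yield_bu_per_acre
 iowa,SOY, 56.5
Illinois ,soy,
 ohio , Soy ,52.0
"""

def wrangle(raw_text):
    """Map raw CSV text to clean records ready for analysis."""
    reader = csv.DictReader(io.StringIO(raw_text))
    records = []
    for row in reader:
        # Normalize header names and strip whitespace from every cell.
        clean = {k.strip().lower(): (v or "").strip() for k, v in row.items()}
        if not clean["yield_bu_per_acre"]:
            continue  # drop rows that have no measurement
        records.append({
            "region": clean["region"].title(),
            "crop": clean["crop"].lower(),
            "yield_bu_per_acre": float(clean["yield_bu_per_acre"]),
        })
    return records

records = wrangle(RAW)
for record in records:
    print(record)
```

Dropping incomplete rows is only one possible policy; a real pipeline might flag or impute them instead.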
Data Refineries Measure Kilauea’s Volcanic
Earth Displacement
Fringes at left indicate earth shifts over time.
Orange shows heat from recent eruptions at site.
Descartes Labs, “Measuring Volcanic Earth Displacement Using Interferometric SAR,” Medium, May 17, 2018.
https://medium.com/@DescartesLabs/measuring-volcanic-earth-displacement-using-interferometric-sar-f307613b692
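As a rough sketch of how fringes translate into earth shifts: in repeat-pass interferometric SAR, the standard relation is that a phase change of Δφ corresponds to a line-of-sight displacement of λ·Δφ/(4π), so one full fringe equals half the radar wavelength. The wavelength below (about 5.55 cm, typical of C-band radar) is an assumption, since the slide does not name the sensor, and the sign convention (toward or away from the satellite) is ignored.

```python
import math

def los_displacement_m(fringes, wavelength_m):
    """Line-of-sight displacement for a number of full fringes (2*pi phase cycles)."""
    delta_phi = 2 * math.pi * fringes
    # Two-way travel path: phase change = 4*pi * displacement / wavelength.
    return wavelength_m * delta_phi / (4 * math.pi)

# Assumed C-band wavelength in meters; not stated in the source.
WAVELENGTH_M = 0.0555

per_fringe = los_displacement_m(1, WAVELENGTH_M)
ten_fringes = los_displacement_m(10, WAVELENGTH_M)
print(f"one fringe  = {per_fringe * 100:.2f} cm along the line of sight")
print(f"ten fringes = {ten_fringes * 100:.2f} cm along the line of sight")
```

Counting fringes between two points on the image therefore gives an estimate of their relative movement along the radar's viewing direction.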