SlideShare a Scribd company logo
1 of 35
AI for Good @
Deep Art Generation, Unsupervised Object Detection,
and Distributed Intelligent Microservices
Mark Hamilton
marhamil@microsoft.com
Overview
 AI for Cultural Institutions: The MET
 GAN image synthesis
 Reverse Image Search
 Intelligent Image Search
 AI for Earth: The Snow Leopard Trust
 Unsupervised Object Detection
 LIME
 AI for Accessibility: Seeing AI
 Currency Detection
 The Tech Stack
 Spark, MMLSpark, Kubernetes, Microservices
Seeing AI
AI for Cultural Institutions
Celebrating 2 years of Open Access at The MET
4
https://gen.studio
 In 2016 The MET Released 400k
Images under Open Access
 This past Winter the MET released a
new subject-keyword dataset of
image annotations
 MIT, The MET, and Microsoft
participated in a 3-day hackathon to
create intelligent experiences using
the new collection
Goals:
Create new
works of art
Use new work
to explore
existing art
Explore further
with intelligent
search
Generative
Adversarial
Networks
Reverse Image
Search
Elasticsearch
with Cognitive
Services
Needed Technologies:
Generative Adversarial Art
https://gen.studio
Real or
Generated?
Noise
Vector
Generator
Generated
Image
Training Data
Discriminator
Real or
Generated?
Custom Reverse Image Search
Filters from Zeiler + Fergus 2013
Query
Image
ResNet
Featurizer
Deep
Features
Closest
Match
Fast Nearest
Neighbor
Lookup
MMLSpark SparkML LSH or Annoy
Example Nearest NeighborsQueryImages
Nearest
Neighbors
Intelligent Search Index
 Pipe images through
Computer Vision API
to annotate image for
searching
 Stream images and
intelligent
annotations to Azure
Search
A picture
containing a
person
A picture
containing a
glass, cup
A fish
swimming
underwater
Query
Image:
Describe
Image
Output:
Deep
Feature
Nearest
Neighbors:
End Application: Gen Studio
10
https://gen.studio
AI for Earth
Endangered Status Matters
Classifying all 1.3
million images will
take 20k hours
Remote Camera Traps
Creating a labelled Training Dataset
Creating a labelled Training Dataset
Input Images
Low Level
Features
Mid Level
Features
Logistic
Regression
Output Classes
Filters from Zeiler + Fergus 2013
Any SparkML
Model
Other
Leopard
Predictions
Transfer Learning with ResNet 50
Performance
With Deep Featurization,
Augmentation, and
Temporal Ensembling
Without Deep
Featurization
Accuracy 65.6% Accuracy 94.7%
Goal: Identify Individual Leopards
Source: HotSpotter - Patterned Species Instance Recognition
Automating Detection with LIME on Spark
LIME on Spark
End to End Architecture
Results
Human Labels
Unsupervised
FRCNN Outputs Human Labels
Unsupervised
FRCNN Outputs
AI for Accessibility
Seeing AI
Currency Identification
A Familiar Architecture…
Union +
Distinct
MobileNet
Logistic
Regression
Bing
Image
Search
Deep Features
Azure Machine
Learning +
Spark Serving
Queries
1 Dollar
5 Dollars
10 Dollars
20 Dollars
Labelled Images
$1
$5
$2
0
Prep Data Train Deploy
The Tech Stack
 A fault-tolerant distributed
computing framework
 Map Reduce + SQL
 Whole program optimization +
query pushdown
 Elastic
 Scala, Python, R, Java, Julia
 ML, Graph Processing, Streaming
Driver
Worker
110
0.9
0
20
40
60
80
100
120
Running Time(s)
Hadoop Spark
Worker Worker
Data Source (ADLS, Blob, Local, Cosmos, etc)
Age:
Int
Name:
String
15 Bob
25 Alice
33 Mary
Age:
Int
Name:
String
1 Sam
2 Claire
53 Tina
Age:
Int
Name:
String
25 Bob
25 Tom
66 Ted
● Code
● Data
ML
 High level library for distributed
machine learning
 More general than SciKit-Learn
 All models have a uniform
interface
Can compose models into
complex pipelines
Can save, load, and transport
models
Load Data
Tokenizer
Term Hashing
Logistic Reg
Evaluate
Pipeline
Evaluate
Load Data
● Estimator
● Transformer
30#UnifiedAnalytics #SparkAISummit
Microsoft Machine Learning for
Apache Spark v0.17
Microsoft’s Open Source
Contributions to Apache Spark
www.aka.ms/spark Azure/mmlspark
Cognitive
Services
Spark Serving Model
Interpretability
LightGBM Gradient
Boosting
Deep Networks
with CNTK
HTTP on
Spark
CNTK on Spark
 GPU/CPU acceleration
 Automated gradient calculations
 Flexible language for defining
huge space of models
 State of the art performance and
accuracy
 ONNX Runtime
 Large scale parallelism
 Fault tolerance
 Auto-scaling / elasticity
 High throughput streaming
 Data source + orchestrator
agnostic
Azure Cognitive Services on Spark
 Easy to use integration
between Spark and the
Azure Cognitive Services
 Composable and
pipelinable with all other
SparkML models!
 Python, Scala, R (Beta) val df = new TextSentiment()
.setTextCol(“text”)
.setOutputCol(“sentiment”)
.transform(inputs)
Spark Worker
Partition Partition Partition
Client Client Client
Web Service
Spark Serving
 Sub-millisecond RESTful Model
Deployment on Spark Clusters
Spark Worker
Partition Partition Partition
Server
Spark Worker
Partition Partition Partition
Server
Spark Master
Users / Apps
Load Balancer HTTP Requests and
Responses
spark.read.parquet.load(…)
.select(…)
spark.readStream.kafka.load(…)
.select(…)
Batch API:
Streaming API:
Serving API:
spark.readStream.server(“0.0.0.0”, 5000).load(…)
.select(…)
Azure Kubernetes Service + Helm
• Works on any k8s cluster
• Helm: Package Manager
for Kubernetes
34
Kubernetes (AKS, ACS, GKE, On-Prem etc)
K8s workerK8s worker
Spark
Worker
Spark
Worker
K8s worker
Cognitive
Service
Container
HTTP on Spark
Spark
Worker
Cognitive
Service
Container
HTTP on Spark
Spark
Worker
Cognitive
Service
Container
HTTP on Spark
Spark
Serving Load
Balancer
Jupyter,
Zepplin,
LIVY, or
Spark
Submit LB
Zepplin
Jupyter
Storage or
other
Databases
Cloud
Cognitive
Services
Spark Serving Hotpath
HTTP on Spark
Spark Readers
REST Requests to
Deployed Models
Submit Jobs, Run Notebooks,
Manage Cluster, etc
Users / Apps
helm repo add mmlspark 
https://dbanda.github.io/charts
helm install mmlspark/spark 
--set localTextApi=true
Dalitso Banda, dbanda@microsoft.com
Microsoft AI Development Acceleration Program
Thanks to
 You all!
 Joseph Sirosh, Sudarshan Raghunathan, Ilya Matiach, Eli Barzilay, Tong
Wen
 Microsoft NERD Garage Team + MIT Externship Program
 Microsoft Development Acceleration Team:
 Dalitso Banda, Casey Hong, Karthik Rajendran, Manon Knoertzer, Tayo
Amuneke, Alejandro Buendia
 Pablo Castro, Chris Hoder, Ryan Gaspar, Henrik Neilsen, Andrew
Schonhoffer, Daniel Ciborowski, Markus Cosowicz
 Azure CAT, AzureML, and Azure Search Teams

More Related Content

What's hot

Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Kai Wähner
 
El camino a las Cloud Native Apps - Azure AI
El camino a las Cloud Native Apps - Azure AIEl camino a las Cloud Native Apps - Azure AI
El camino a las Cloud Native Apps - Azure AI
Plain Concepts
 
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Kai Wähner
 

What's hot (20)

IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...
IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...
IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, K...
 
Blind spots in big data erez koren @ forter
Blind spots in big data erez koren @ forterBlind spots in big data erez koren @ forter
Blind spots in big data erez koren @ forter
 
Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
Machine Learning Trends of 2018 combined with the Apache Kafka EcosystemMachine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem
 
Amazon ElasticSearch Service
Amazon ElasticSearch Service  Amazon ElasticSearch Service
Amazon ElasticSearch Service
 
One Kubernetes to rule them all (ZEUS 2019 Keynote)
One Kubernetes to rule them all (ZEUS 2019 Keynote)One Kubernetes to rule them all (ZEUS 2019 Keynote)
One Kubernetes to rule them all (ZEUS 2019 Keynote)
 
Deep Learning with Microsoft Cognitive Toolkit
Deep Learning with Microsoft Cognitive ToolkitDeep Learning with Microsoft Cognitive Toolkit
Deep Learning with Microsoft Cognitive Toolkit
 
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
Deep Learning Streaming Platform with Kafka Streams, TensorFlow, DeepLearning...
 
ConFoo - Exploring .NET’s memory management – a trip down memory lane
ConFoo - Exploring .NET’s memory management – a trip down memory laneConFoo - Exploring .NET’s memory management – a trip down memory lane
ConFoo - Exploring .NET’s memory management – a trip down memory lane
 
Azure によるスピードレイヤの分析アーキテクチャ
Azure によるスピードレイヤの分析アーキテクチャAzure によるスピードレイヤの分析アーキテクチャ
Azure によるスピードレイヤの分析アーキテクチャ
 
El camino a las Cloud Native Apps - Azure AI
El camino a las Cloud Native Apps - Azure AIEl camino a las Cloud Native Apps - Azure AI
El camino a las Cloud Native Apps - Azure AI
 
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
Deep Learning at Extreme Scale (in the Cloud) 
with the Apache Kafka Open Sou...
 
Get Value From Your Data
Get Value From Your DataGet Value From Your Data
Get Value From Your Data
 
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
Kai Waehner - KSQL – The Open Source SQL Streaming Engine for Apache Kafka - ...
 
Data analytics master class: predict hotel revenue
Data analytics master class: predict hotel revenueData analytics master class: predict hotel revenue
Data analytics master class: predict hotel revenue
 
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
How to Leverage the Apache Kafka Ecosystem to Productionize Machine Learning ...
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018
 
PuppetConf 2017: Moving faster with Puppet & Splunk- Hal Rottenberg, Andrew B...
PuppetConf 2017: Moving faster with Puppet & Splunk- Hal Rottenberg, Andrew B...PuppetConf 2017: Moving faster with Puppet & Splunk- Hal Rottenberg, Andrew B...
PuppetConf 2017: Moving faster with Puppet & Splunk- Hal Rottenberg, Andrew B...
 
(BDT306) Mission-Critical Stream Processing with Amazon EMR and Amazon Kinesi...
(BDT306) Mission-Critical Stream Processing with Amazon EMR and Amazon Kinesi...(BDT306) Mission-Critical Stream Processing with Amazon EMR and Amazon Kinesi...
(BDT306) Mission-Critical Stream Processing with Amazon EMR and Amazon Kinesi...
 
Multi runtime serving pipelines for machine learning
Multi runtime serving pipelines for machine learningMulti runtime serving pipelines for machine learning
Multi runtime serving pipelines for machine learning
 
Microsoft Machine Learning Server. Architecture View
Microsoft Machine Learning Server. Architecture ViewMicrosoft Machine Learning Server. Architecture View
Microsoft Machine Learning Server. Architecture View
 

Similar to AI for Good at Microsoft

Similar to AI for Good at Microsoft (20)

Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei ZahariaDeep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
 
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei ZahariaDeep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
Deep Learning and Streaming in Apache Spark 2.x with Matei Zaharia
 
The Azure Cognitive Services on Spark: Clusters with Embedded Intelligent Ser...
The Azure Cognitive Services on Spark: Clusters with Embedded Intelligent Ser...The Azure Cognitive Services on Spark: Clusters with Embedded Intelligent Ser...
The Azure Cognitive Services on Spark: Clusters with Embedded Intelligent Ser...
 
Yu's resume
Yu's resumeYu's resume
Yu's resume
 
Microsoft AI Platform Overview
Microsoft AI Platform OverviewMicrosoft AI Platform Overview
Microsoft AI Platform Overview
 
Tools of Research: Machine Learning | AWS Public Sector Summit 2017
Tools of Research: Machine Learning | AWS Public Sector Summit 2017Tools of Research: Machine Learning | AWS Public Sector Summit 2017
Tools of Research: Machine Learning | AWS Public Sector Summit 2017
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
 
Microsoft & Machine Learning / Artificial Intelligence
Microsoft & Machine Learning / Artificial IntelligenceMicrosoft & Machine Learning / Artificial Intelligence
Microsoft & Machine Learning / Artificial Intelligence
 
Spark streaming State of the Union - Strata San Jose 2015
Spark streaming State of the Union - Strata San Jose 2015Spark streaming State of the Union - Strata San Jose 2015
Spark streaming State of the Union - Strata San Jose 2015
 
Big data apache spark + scala
Big data   apache spark + scalaBig data   apache spark + scala
Big data apache spark + scala
 
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
Expanding Apache Spark Use Cases in 2.2 and Beyond with Matei Zaharia and dem...
 
AWS re:Invent 2016 : announcement, technical demos and feedbacks
AWS re:Invent 2016 : announcement, technical demos and feedbacksAWS re:Invent 2016 : announcement, technical demos and feedbacks
AWS re:Invent 2016 : announcement, technical demos and feedbacks
 
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
Build Deep Learning Applications for Big Data Platforms (CVPR 2018 tutorial)
 
Designing Artificial Intelligence
Designing Artificial IntelligenceDesigning Artificial Intelligence
Designing Artificial Intelligence
 
Strata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark communityStrata NYC 2015 - What's coming for the Spark community
Strata NYC 2015 - What's coming for the Spark community
 
Microsoft Build 2023 Updates – Copilot Stack and Azure OpenAI Service (Machin...
Microsoft Build 2023 Updates – Copilot Stack and Azure OpenAI Service (Machin...Microsoft Build 2023 Updates – Copilot Stack and Azure OpenAI Service (Machin...
Microsoft Build 2023 Updates – Copilot Stack and Azure OpenAI Service (Machin...
 
IBM Strategy for Spark
IBM Strategy for SparkIBM Strategy for Spark
IBM Strategy for Spark
 
Leonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning WorldLeonard Austin (Ravelin) - DevOps in a Machine Learning World
Leonard Austin (Ravelin) - DevOps in a Machine Learning World
 
Machine Learning on the Cloud with Apache MXNet
Machine Learning on the Cloud with Apache MXNetMachine Learning on the Cloud with Apache MXNet
Machine Learning on the Cloud with Apache MXNet
 
Khan farhan cv
Khan farhan cvKhan farhan cv
Khan farhan cv
 

Recently uploaded

CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
Wonjun Hwang
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
panagenda
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc
 

Recently uploaded (20)

How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
Overview of Hyperledger Foundation
Overview of Hyperledger FoundationOverview of Hyperledger Foundation
Overview of Hyperledger Foundation
 
WebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM PerformanceWebAssembly is Key to Better LLM Performance
WebAssembly is Key to Better LLM Performance
 
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptxCyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
Cyber Insurance - RalphGilot - Embry-Riddle Aeronautical University.pptx
 
Portal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russePortal Kombat : extension du réseau de propagande russe
Portal Kombat : extension du réseau de propagande russe
 
CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)CORS (Kitworks Team Study 양다윗 발표자료 240510)
CORS (Kitworks Team Study 양다윗 발표자료 240510)
 
2024 May Patch Tuesday
2024 May Patch Tuesday2024 May Patch Tuesday
2024 May Patch Tuesday
 
Intro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptxIntro to Passkeys and the State of Passwordless.pptx
Intro to Passkeys and the State of Passwordless.pptx
 
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdfFrisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
Frisco Automating Purchase Orders with MuleSoft IDP- May 10th, 2024.pptx.pdf
 
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
Easier, Faster, and More Powerful – Alles Neu macht der Mai -Wir durchleuchte...
 
الأمن السيبراني - ما لا يسع للمستخدم جهله
الأمن السيبراني - ما لا يسع للمستخدم جهلهالأمن السيبراني - ما لا يسع للمستخدم جهله
الأمن السيبراني - ما لا يسع للمستخدم جهله
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
Vector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptxVector Search @ sw2con for slideshare.pptx
Vector Search @ sw2con for slideshare.pptx
 
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on ThanabotsContinuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
Continuing Bonds Through AI: A Hermeneutic Reflection on Thanabots
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
UiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overviewUiPath manufacturing technology benefits and AI overview
UiPath manufacturing technology benefits and AI overview
 
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties ReimaginedEasier, Faster, and More Powerful – Notes Document Properties Reimagined
Easier, Faster, and More Powerful – Notes Document Properties Reimagined
 
How to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in PakistanHow to Check GPS Location with a Live Tracker in Pakistan
How to Check GPS Location with a Live Tracker in Pakistan
 

AI for Good at Microsoft

  • 1. AI for Good @ Deep Art Generation, Unsupervised Object Detection, and Distributed Intelligent Microservices Mark Hamilton marhamil@microsoft.com
  • 2. Overview  AI for Cultural Institutions: The MET  GAN image synthesis  Reverse Image Search  Intelligent Image Search  AI for Earth: The Snow Leopard Trust  Unsupervised Object Detection  LIME  AI for Accessibility: Seeing AI  Currency Detection  The Tech Stack  Spark, MMLSpark, Kubernetes, Microservices Seeing AI
  • 3. AI for Cultural Institutions
  • 4. Celebrating 2 years of Open Access at The MET 4 https://gen.studio  In 2016 The MET Released 400k Images under Open Access  This past Winter the MET released a new subject-keyword dataset of image annotations  MIT, The MET, and Microsoft participated in a 3-day hackathon to create intelligent experiences using the new collection
  • 5. Goals: Create new works of art Use new work to explore existing art Explore further with intelligent search Generative Adversarial Networks Reverse Image Search Elasticsearch with Cognitive Services Needed Technologies:
  • 6. Generative Adversarial Art https://gen.studio Real or Generated? Noise Vector Generator Generated Image Training Data Discriminator Real or Generated?
  • 7. Custom Reverse Image Search Filters from Zeiler + Fergus 2013 Query Image ResNet Featurizer Deep Features Closest Match Fast Nearest Neighbor Lookup MMLSpark SparkML LSH or Annoy
  • 9. Intelligent Search Index  Pipe images through Computer Vision API to annotate image for searching  Stream images and intelligent annotations to Azure Search A picture containing a person A picture containing a glass, cup A fish swimming underwater Query Image: Describe Image Output: Deep Feature Nearest Neighbors:
  • 10. End Application: Gen Studio 10 https://gen.studio
  • 13. Classifying all 1.3 million images will take 20k hours Remote Camera Traps
  • 14.
  • 15. Creating a labelled Training Dataset
  • 16. Creating a labelled Training Dataset
  • 17. Input Images Low Level Features Mid Level Features Logistic Regression Output Classes Filters from Zeiler + Fergus 2013 Any SparkML Model Other Leopard Predictions Transfer Learning with ResNet 50
  • 18. Performance With Deep Featurization, Augmentation, and Temporal Ensembling Without Deep Featurization Accuracy 65.6% Accuracy 94.7%
  • 19. Goal: Identify Individual Leopards Source: HotSpotter - Patterned Species Instance Recognition
  • 20. Automating Detection with LIME on Spark
  • 22. End to End Architecture
  • 23. Results Human Labels Unsupervised FRCNN Outputs Human Labels Unsupervised FRCNN Outputs
  • 26. A Familiar Architecture… Union + Distinct MobileNet Logistic Regression Bing Image Search Deep Features Azure Machine Learning + Spark Serving Queries 1 Dollar 5 Dollars 10 Dollars 20 Dollars Labelled Images $1 $5 $2 0 Prep Data Train Deploy
  • 28.  A fault-tolerant distributed computing framework  Map Reduce + SQL  Whole program optimization + query pushdown  Elastic  Scala, Python, R, Java, Julia  ML, Graph Processing, Streaming Driver Worker 110 0.9 0 20 40 60 80 100 120 Running Time(s) Hadoop Spark Worker Worker Data Source (ADLS, Blob, Local, Cosmos, etc) Age: Int Name: String 15 Bob 25 Alice 33 Mary Age: Int Name: String 1 Sam 2 Claire 53 Tina Age: Int Name: String 25 Bob 25 Tom 66 Ted ● Code ● Data
  • 29. ML  High level library for distributed machine learning  More general than SciKit-Learn  All models have a uniform interface Can compose models into complex pipelines Can save, load, and transport models Load Data Tokenizer Term Hashing Logistic Reg Evaluate Pipeline Evaluate Load Data ● Estimator ● Transformer
  • 30. 30#UnifiedAnalytics #SparkAISummit Microsoft Machine Learning for Apache Spark v0.17 Microsoft’s Open Source Contributions to Apache Spark www.aka.ms/spark Azure/mmlspark Cognitive Services Spark Serving Model Interpretability LightGBM Gradient Boosting Deep Networks with CNTK HTTP on Spark
  • 31. CNTK on Spark  GPU/CPU acceleration  Automated gradient calculations  Flexible language for defining huge space of models  State of the art performance and accuracy  ONNX Runtime  Large scale parallelism  Fault tolerance  Auto-scaling / elasticity  High throughput streaming  Data source + orchestrator agnostic
  • 32. Azure Cognitive Services on Spark  Easy to use integration between Spark and the Azure Cognitive Services  Composable and pipelinable with all other SparkML models!  Python, Scala, R (Beta) val df = new TextSentiment() .setTextCol(“text”) .setOutputCol(“sentiment”) .transform(inputs) Spark Worker Partition Partition Partition Client Client Client Web Service
  • 33. Spark Serving  Sub-millisecond RESTful Model Deployment on Spark Clusters Spark Worker Partition Partition Partition Server Spark Worker Partition Partition Partition Server Spark Master Users / Apps Load Balancer HTTP Requests and Responses spark.read.parquet.load(…) .select(…) spark.readStream.kafka.load(…) .select(…) Batch API: Streaming API: Serving API: spark.readStream.server(“0.0.0.0”, 5000).load(…) .select(…)
  • 34. Azure Kubernetes Service + Helm • Works on any k8s cluster • Helm: Package Manager for Kubernetes 34 Kubernetes (AKS, ACS, GKE, On-Prem etc) K8s workerK8s worker Spark Worker Spark Worker K8s worker Cognitive Service Container HTTP on Spark Spark Worker Cognitive Service Container HTTP on Spark Spark Worker Cognitive Service Container HTTP on Spark Spark Serving Load Balancer Jupyter, Zepplin, LIVY, or Spark Submit LB Zepplin Jupyter Storage or other Databases Cloud Cognitive Services Spark Serving Hotpath HTTP on Spark Spark Readers REST Requests to Deployed Models Submit Jobs, Run Notebooks, Manage Cluster, etc Users / Apps helm repo add mmlspark https://dbanda.github.io/charts helm install mmlspark/spark --set localTextApi=true Dalitso Banda, dbanda@microsoft.com Microsoft AI Development Acceleration Program
  • 35. Thanks to  You all!  Joseph Sirosh, Sudarshan Raghunathan, Ilya Matiach, Eli Barzilay, Tong Wen  Microsoft NERD Garage Team + MIT Externship Program  Microsoft Development Acceleration Team:  Dalitso Banda, Casey Hong, Karthik Rajendran, Manon Knoertzer, Tayo Amuneke, Alejandro Buendia  Pablo Castro, Chris Hoder, Ryan Gaspar, Henrik Neilsen, Andrew Schonhoffer, Daniel Ciborowski, Markus Cosowicz  Azure CAT, AzureML, and Azure Search Teams