Looking from Above: Object Detection and Other Computer Vision Tasks on Satellite Imagery
Xiaoyong Zhu, Siyu Yang
Microsoft
Satellite Imagery of Las Vegas from 1984-2018
Source: http://world.time.com/timelapse/
Environmentally related
• Planet: Understanding the Amazon from Space (Kaggle competition)
  • Classification of image chips into 12 classes related to rainforest health
• NOAA drone and aerial datasets on monk seals and right whales for population counting and identification
• Various land use segmentation datasets (DeepGlobe, UC Merced, Chesapeake Conservancy)

Others
• Cars Overhead with Context (2016)
• Microsoft project on drones for power line maintenance (2016)
• SpaceNet for building and road extraction (2016, ongoing; DeepGlobe workshop at CVPR 2018)
• NWPU-RESISC45 dataset for satellite scene classification (2017)
• IARPA Functional Map of the World Challenge (2017-18)
• xView (2018)
• Airbus ship detection on Kaggle (2018)
• TGS salt identification from seismic images (a mining application)
Counting caribou from aerial imagery (AI for Wildlife Conservation 2018)
NOAA aircraft surveys use images to count sea lions, seals and polar bears
The xView Dataset
• Released by Defense Innovation Unit Experimental (DIUx) to advance computer vision solutions for national defense and disaster response.
• 60 classes, 1 million labeled instances, object detection (bounding boxes) tasks
Figure: xView compared to other computer vision contests, measured by number of labeled instances and number of classes
Example classes:
• Fixed-wing aircraft, Small aircraft, Cargo/passenger planes, Helicopter
• Building, Shed, Damaged building, Facility, Shipping container, Shipping container lot, Vehicle lot, Construction site, Helipad, Pylon, Storage tank
• Motorboat, Sailboat, Tugboat, Barge, Fishing vessel, Ferry, Yacht, Container ship, Oil tanker
• Pickup truck, Utility truck, Cargo truck, Truck with box, Truck with Flatbed, Truck with Liquid, Truck tractor trailer, Cement mixer, Mobile Crane, Straddle carrier, Excavator, Small car…
1. Class imbalance
2. Fine-grained classification required (e.g., Cargo Truck vs. Pickup Truck)
3. Large variance in object scale
4. Input labeled images are very large (approx. 5000 x 5000 pixels)
5. Label quality
846 released images (each approx. 5000 x 5000 pixels)
Shuffle and split into:
• 596 training images
• 250 validation images
while making sure all 60 categories are represented in both training and validation sets
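The shuffle-and-split step above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_with_coverage`, the strategy of reshuffling until both subsets cover every class, and the toy `labels_by_image` mapping are all assumptions.

```python
import random


def split_with_coverage(labels_by_image, n_val, num_retries=1000, seed=0):
    """Shuffle image IDs and split them, retrying until every class
    appears in both the training and the validation subset.

    labels_by_image: dict mapping image ID -> set of class IDs in that image
    (hypothetical structure, chosen for illustration).
    """
    all_classes = set().union(*labels_by_image.values())
    image_ids = sorted(labels_by_image)
    rng = random.Random(seed)
    for _ in range(num_retries):
        rng.shuffle(image_ids)
        val_ids, train_ids = image_ids[:n_val], image_ids[n_val:]
        train_classes = set().union(*(labels_by_image[i] for i in train_ids))
        val_classes = set().union(*(labels_by_image[i] for i in val_ids))
        if train_classes == all_classes and val_classes == all_classes:
            return train_ids, val_ids
    raise RuntimeError("no split found that covers all classes in both sets")


# Toy data: 12 images, 3 classes (the slides describe 846 images, 60 classes).
toy = {f"img_{k}": {k % 3} for k in range(12)}
train_ids, val_ids = split_with_coverage(toy, n_val=4)
print(len(train_ids), len(val_ids))  # 8 4
```

With the real data, the same call would use `n_val=250` and a mapping built from the xView annotations; rare classes may require many retries or a stratified strategy instead.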
Original image
Determine the desired chip size (600 x 600 pixels)
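A sketch of the chipping step, assuming non-overlapping chips and zero padding at the right and bottom edges (the slides do not say how edges or overlaps were handled). `chip_image` is a hypothetical helper, and NumPy is the only dependency.

```python
import numpy as np


def chip_image(image, chip_size=600):
    """Cut an H x W x C array into non-overlapping chip_size x chip_size chips.

    The right and bottom edges are zero-padded so every chip has the same
    shape. Returns the stacked chips and the (top, left) pixel offset of
    each chip in the original image.
    """
    height, width = image.shape[:2]
    pad_h = (-height) % chip_size  # rows needed to reach a multiple of chip_size
    pad_w = (-width) % chip_size   # columns needed to reach a multiple of chip_size
    padded = np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="constant")
    chips, offsets = [], []
    for top in range(0, padded.shape[0], chip_size):
        for left in range(0, padded.shape[1], chip_size):
            chips.append(padded[top:top + chip_size, left:left + chip_size])
            offsets.append((top, left))
    return np.stack(chips), offsets


# A blank stand-in for one approx. 5000 x 5000 pixel image.
image = np.zeros((5000, 5000, 3), dtype=np.uint8)
chips, offsets = chip_image(image)
print(chips.shape)  # (81, 600, 600, 3)
```

Bounding boxes would also need to be shifted by each chip's offset and clipped to the chip; that bookkeeping is omitted here.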
Figure annotation: standard deviation in these columns
| Type of detector | Intuition | Feature | Classic models |
|---|---|---|---|
| One stage | Unified proposal and classification network | Faster at inference | Single Shot MultiBox Detector (SSD), YOLO |
| Two stage | Separate proposal and classification network | Higher accuracy | Faster R-CNN, 2015 |
Scale Normalization for Image Pyramids (SNIP).
| Model Name | overall mAP | small mAP | medium mAP | large mAP |
|---|---|---|---|---|
| SSD + Inception v2 | 0.16 | 0.12 | 0.19 | 0.18 |
| Faster RCNN + FPN + Deformable Operators | 0.19 (+18.75%) | 0.17 | 0.22 | 0.21 |
| SNIP/SNIPER (+ Data augmentation) | 0.22 (+15.79%) | 0.17 | 0.23 | 0.28 |
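The percentages in parentheses appear to be relative gains in overall mAP over the preceding model. The short check below reproduces them under that reading; the function and variable names are illustrative only.

```python
def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Overall mAP values copied from the results above.
overall_map = {
    "SSD + Inception v2": 0.16,
    "Faster RCNN + FPN + Deformable Operators": 0.19,
    "SNIP/SNIPER (+ Data augmentation)": 0.22,
}

names = list(overall_map)
for prev, curr in zip(names, names[1:]):
    gain = relative_gain(overall_map[curr], overall_map[prev])
    print(f"{curr}: +{gain:.2f}% over {prev}")
# Faster RCNN + FPN + Deformable Operators: +18.75% over SSD + Inception v2
# SNIP/SNIPER (+ Data augmentation): +15.79% over Faster RCNN + FPN + Deformable Operators
```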
DATA: 1M objects, 60 classes, 0.3 meter resolution, 1,415 square kilometers; stored in Azure Blob storage
BUILD: Azure Machine Learning Service, Data Science VM
TRAIN: SNIPER as the training model (a variant of Faster R-CNN)
DEPLOY: Edge devices (iOS); cloud (Azure Machine Learning / Azure Kubernetes Service)
• A range of detector and backbone networks to choose from
• Hyperparameters selected via a config file
• Mature and relatively well maintained
• Hard to customize
• Current version does not support distributed training

• Integrates with distributed training framework Horovod
• Poor GPU utilization (since improved)
• Has a small community around it

• State-of-the-art result on COCO
• Does not integrate with other network architecture implementations
• Model is deployed to iOS devices so field workers can use it to see real-time analysis
• iOS app built by interns on the Microsoft Garage team
• https://www.microsoft.com/en-us/garage/profiles/earth-lens/
• Model is also deployed to Azure Machine Learning Service for scalable inferencing
xiaoyzhu@microsoft.com
yasiyu@microsoft.com
Looking from Above: Object Detection and Other Computer Vision Tasks on Satellite Imagery

  • 1. Looking from Above: Object Detection and Other Computer Vision Tasks on Satellite Imagery Xiaoyong Zhu, Siyu Yang Microsoft
  • 2.
  • 3.
  • 4. Satellite Imagery of Las Vegas from 1984-2018 Source: http://world.time.com/timelapse/
  • 5. Environmentally related:
       • Planet: Understanding the Amazon from Space (Kaggle competition): classification of chips into 12 classes related to rainforest health
       • NOAA drone and aerial datasets on monk seals and right whales for population counting and identification
       • Various land use segmentation datasets (DeepGlobe, UC Merced, Chesapeake Conservancy)
     Others:
       • Cars Overhead with Context (2016)
       • Microsoft project on drones for power line maintenance (2016)
       • SpaceNet for building and road extraction (DeepGlobe workshop at CVPR 2018) (2016, ongoing)
       • NWPU-RESISC45 dataset for satellite scene classification (2017)
       • IARPA Functional Map of the World Challenge (2017-18)
       • xView (2018)
       • Airbus ship detection on Kaggle (2018)
       • TGS salt identification from seismic images (a mining application)
  • 6. Counting caribou from aerial imagery (AI for Wildlife Conservation 2018); NOAA aircraft surveys use images to count sea lions, seals and polar bears
  • 7.
  • 8.
  • 9. The xView Dataset
       • Released by Defense Innovation Unit Experimental (DIUx) to advance computer vision solutions for national defense and disaster response.
       • 60 classes, 1 million labeled instances, object detection (bounding boxes) tasks
     Figure: xView compared to other computer vision contests, measured by number of labeled instances and number of classes
     Classes:
       • Fixed-wing aircraft, Small aircraft, Cargo/passenger planes, Helicopter
       • Building, Shed, Damaged building, Facility, Shipping container, Shipping container lot, Vehicle lot, Construction site, Helipad, Pylon, Storage tank
       • Motorboat, Sailboat, Tugboat, Barge, Fishing vessel, Ferry, Yacht, Container ship, Oil tanker
       • Pickup truck, Utility truck, Cargo truck, Truck with box, Truck with Flatbed, Truck with Liquid, Truck tractor trailer, Cement mixer, Mobile Crane, Straddle carrier, Excavator, Small car…
  • 10. 1. Class imbalance
       2. Fine-grained classification required (Cargo Truck and Pickup Truck)
       3. Large scale variance
       4. Input labeled images are very large (approx. 5000 x 5000 pixels)
       5. Label quality
  • 11. The 846 released images (each approx. 5000 x 5000 pixels) are shuffled and split into 596 training images and 250 validation images, while making sure all 60 categories are represented in both the training and validation sets
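
The shuffle-and-split step on slide 11 could be sketched as follows. This is only an illustration, not the presenters' code: the function name `split_with_all_classes`, the `image_classes` mapping, and the retry-until-covered strategy are all assumptions made for the example.

```python
import random


def split_with_all_classes(image_classes, n_val, seed=0, max_tries=1000):
    """Shuffle image ids and split them into train/validation lists.

    image_classes: dict mapping image id -> set of class ids in that image
                   (hypothetical structure, assumed for this sketch).
    The shuffle is retried until every class appears in both subsets.
    """
    all_classes = set().union(*image_classes.values())
    ids = sorted(image_classes)
    rng = random.Random(seed)

    def covers_all(subset):
        # True when the images in `subset` together contain every class.
        seen = set().union(*(image_classes[i] for i in subset))
        return seen == all_classes

    for _ in range(max_tries):
        rng.shuffle(ids)
        val, train = ids[:n_val], ids[n_val:]
        if covers_all(train) and covers_all(val):
            return sorted(train), sorted(val)
    raise RuntimeError("no split found with every class in both subsets")
```

With the numbers on the slide, this would be called with 846 image ids and `n_val=250`, leaving 596 images for training. For rare classes a plain retry loop may take many attempts, so a real pipeline might place those images deliberately instead.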
  • 12. Original image: determine the desired chip size (600 x 600 pixels)
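
A minimal sketch of how a large image might be cut into 600 x 600 chips, as described on slide 12. The slides do not say how image edges or overlap were handled, so the `overlap` parameter and the choice to shift the last chip back so it ends flush with the image border are assumptions of this example, and `chip_windows` is a hypothetical helper name.

```python
def chip_windows(width, height, chip=600, overlap=0):
    """Return (x0, y0, x1, y1) pixel windows that tile a width x height image.

    Assumption: the final chip in each row/column is shifted back so that it
    ends exactly at the image border instead of being padded or dropped.
    """
    stride = chip - overlap

    def starts(size):
        if size <= chip:
            return [0]
        positions = list(range(0, size - chip + 1, stride))
        if positions[-1] != size - chip:
            positions.append(size - chip)  # last chip flush with the edge
        return positions

    return [
        (x, y, min(x + chip, width), min(y + chip, height))
        for y in starts(height)
        for x in starts(width)
    ]


windows = chip_windows(5000, 5000)
print(len(windows))  # 81 chips: 9 per row times 9 per column
```

Bounding-box labels would then need to be clipped and offset to each chip's coordinates, which this sketch leaves out.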
  • 13.
  • 15.
  • 16. Type of detector | Intuition | Feature | Classic models
       One stage | Unified proposal and classification network | Faster at inference | Single Shot MultiBox Detector (SSD), YOLO
       Two stage | Separate proposal and classification network | Higher accuracy | Faster R-CNN, 2015
  • 17. Scale Normalization for Image Pyramids (SNIP).
  • 18. Model Name | overall mAP | small mAP | medium mAP | large mAP
       SSD + Inception v2 | 0.16 | 0.12 | 0.19 | 0.18
       Faster RCNN + FPN + Deformable Operators | 0.19 (+18.75%) | 0.17 | 0.22 | 0.21
       SNIP/SNIPER (+ Data augmentation) | 0.22 (+15.79%) | 0.17 | 0.23 | 0.28
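
The percentages in parentheses on slide 18 appear to be relative gains in overall mAP over the row directly above, not over the SSD baseline. A quick check of that reading, where `relative_gain` is a helper defined only for this illustration:

```python
def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100


# Overall mAP values taken from the table on slide 18.
ssd, faster_rcnn, sniper = 0.16, 0.19, 0.22

print(f"{relative_gain(faster_rcnn, ssd):.2f}%")     # 18.75%
print(f"{relative_gain(sniper, faster_rcnn):.2f}%")  # 15.79%
```

Measured against the SSD baseline instead, the SNIP/SNIPER row would be a gain of 37.5% in overall mAP.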
  • 19.
  • 20.
  • 21.
  • 22. DATA, BUILD, TRAIN, DEPLOY: 1M objects; 60 classes; 0.3 meter resolution; 1,415 square kilometers; Azure Machine Learning Service; Data Science VM; SNIPER as the training model (a variant of Faster R-CNN); stored in Azure Blob storage; edge devices (iOS); cloud (Azure Machine Learning / Azure Kubernetes Service)
  • 23. • A range of detector and backbone networks to choose from
       • Hyperparameters selected via a config file
       • Mature and relatively well maintained
       • Hard to customize
       • Current version does not support distributed training
       • Integrates with distributed training framework Horovod
       • Poor GPU utilization (since improved)
       • Has a small community around it
       • State-of-the-art result on COCO
       • Does not integrate with other network architecture implementations
  • 24.
  • 25.
  • 26. • Model is deployed to iOS devices so field workers can use it to see real-time analysis
       • iOS app built by interns in the Microsoft Garage team
       • https://www.microsoft.com/en-us/garage/profiles/earth-lens/
  • 27. • Model is also deployed to Azure Machine Learning Service for scalable inferencing