Large Scale Landuse Classification of Satellite Imagery
Suneel Marthi
Jose Luis Contreras
June 11, 2018
Berlin Buzzwords, Berlin, Germany
1
Agenda
Introduction
Satellite Image Data Description
Cloud Classification
Segmentation
Apache Beam
Beam Inference Pipeline
Demo
Future Work
2
Goal: Identify Tulip fields from Sentinel-2 satellite images
3
Workflow
4
Data: Sentinel-2
Earth observation mission from ESA
13 spectral bands, from RGB to SWIR (Short Wave Infrared)
Spatial resolution: 10 m/px (RGB bands)
5 day revisit time
Free and open data policy
5
Data acquisition
Images downloaded using Sentinel Hub's WMS (Web Mapping Service)
Download tool from Matthieu Guillaumin (@mguillau)
6
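The WMS download amounts to standard OGC GetMap requests. A minimal sketch of building such a request, assuming a Sentinel Hub-style endpoint; the instance ID, layer name, and bounding box below are placeholders, not the values used for this talk:

```python
# Sketch: building a WMS 1.3.0 GetMap URL for one 256x256 RGB tile.
# Endpoint, layer name, and bbox are illustrative placeholders.
from urllib.parse import urlencode

def wms_getmap_url(base_url, layer, bbox, width=256, height=256):
    """Build a GetMap URL returning a PNG tile for the given bounding box."""
    params = {
        "SERVICE": "WMS",
        "REQUEST": "GetMap",
        "VERSION": "1.3.0",
        "LAYERS": layer,
        "BBOX": ",".join(str(c) for c in bbox),
        "CRS": "EPSG:4326",
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)

url = wms_getmap_url(
    "https://services.sentinel-hub.com/ogc/wms/<instance-id>",  # placeholder
    "TRUE-COLOR", (52.2, 4.4, 52.3, 4.5))
```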
Data
256 x 256 px images, RGB
7
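A quick sanity check on what one tile covers on the ground, from the numbers above (256 px at 10 m/px):

```python
# Ground coverage of a single 256x256 px tile at 10 m/px.
pixels, metres_per_px = 256, 10
side_km = pixels * metres_per_px / 1000
# Each tile covers roughly 2.56 km x 2.56 km.
```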
Workflow
8
Filter Clouds
Need to remove cloudy images before segmenting
Approach: train a neural network to classify images as clear or cloudy
CNN architectures: ResNet50 and ResNet101
9
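The filtering step itself is simple once the classifier is trained: keep only images the network scores as clear. A minimal NumPy sketch of that selection logic (the actual pipeline runs it inside a Beam DoFn, shown later in the deck):

```python
# Sketch: keep only images predicted "clear" given per-image softmax scores.
# Convention assumed here: class 0 = clear, class 1 = cloudy.
import numpy as np

def keep_clear(filenames, scores):
    """scores: (N, 2) softmax outputs; returns the filenames predicted clear."""
    preds = np.argmax(scores, axis=1)
    return [fn for fn, p in zip(filenames, preds) if p == 0]

scores = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])
clear = keep_clear(["a.png", "b.png", "c.png"], scores)  # -> ['a.png', 'c.png']
```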
ResNet building block
10
Filter Clouds: training data
'Planet: Understanding the Amazon from Space' Kaggle competition
40K images labeled as clear, hazy, partly cloudy or cloudy
11
Filter Clouds: training data (2)

Origin                      No. of Images  Cloudy Images
Kaggle competition          40,000         30%
Sentinel-2 (hand labelled)  5,000          50%
Total                       45,000         32%

Only two classes: clear and cloudy (cloudy = haze + partly cloudy + cloudy)
12
Training data split
13
Results

Model      Accuracy  F1     Epochs (train + finetune)
ResNet50   0.983     0.986  23 + 7
ResNet101  0.978     0.982  43 + 9

Choose ResNet50 for filtering cloudy images
14
Example Results
15
Data Augmentation

import Augmentor
p = Augmentor.Pipeline(img_dir)
p.skew(probability=0.5, magnitude=0.5)
# Augmentor's shear and rotate take separate left/right limits
p.shear(probability=0.3, max_shear_left=15, max_shear_right=15)
p.flip_left_right(probability=0.5)
p.flip_top_bottom(probability=0.5)
p.rotate_random_90(probability=0.75)
p.rotate(probability=0.75, max_left_rotation=20, max_right_rotation=20)
16
Example Data Augmentation
17
Workflow
18
Segmentation Goals
19
Approach: U-Net
State-of-the-art CNN for image segmentation
Commonly used with biomedical images
A strong fit for a task like this one
O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. arXiv:1505.04597, 2015
20
U-Net Architecture
21
U-Net Building Blocks

from mxnet.gluon import nn

def conv_block(channels, kernel_size):
    out = nn.HybridSequential()
    out.add(
        nn.Conv2D(channels, kernel_size, padding=1, use_bias=False),
        nn.BatchNorm(),
        nn.Activation('relu')
    )
    return out

def down_block(channels):
    out = nn.HybridSequential()
    out.add(
        conv_block(channels, 3),
        conv_block(channels, 3)
    )
    return out
22
U-Net Building Blocks (2)

class up_block(nn.HybridBlock):
    def __init__(self, channels, shrink=True, **kwargs):
        super(up_block, self).__init__(**kwargs)
        # kernel_size/use_bias reconstructed; the original slide truncates this line
        self.upsampler = nn.Conv2DTranspose(channels=channels, kernel_size=4,
                                            strides=2, padding=1, use_bias=False)
        self.conv1 = conv_block(channels, 1)
        self.conv3_0 = conv_block(channels, 3)
        if shrink:
            self.conv3_1 = conv_block(int(channels / 2), 3)
        else:
            self.conv3_1 = conv_block(channels, 3)

    def hybrid_forward(self, F, x, s):
        x = self.upsampler(x)
        x = self.conv1(x)
        x = F.relu(x)
        x = F.Crop(*[x, s], center_crop=True)  # crop to the skip connection's size
        x = F.concat(s, x, dim=1)              # merge with the skip connection
        x = self.conv3_0(x)
        return self.conv3_1(x)
23
U-Net: Training data
Ground truth: tulip fields in the Netherlands
Provided by Geopedia, from Sinergise
24
Loss function: Soft Dice Coefficient loss
Prediction = probability of each pixel belonging to a tulip field (softmax output)
ε serves to prevent division by zero
25
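The loss described above can be sketched in NumPy; this is a minimal single-image version (the formula image is lost from the slide, but the standard soft Dice loss follows from the description):

```python
# Soft Dice coefficient loss: 1 - (2*|p∩t| + eps) / (|p| + |t| + eps),
# where p = per-pixel predicted probabilities and t = binary ground truth.
# eps prevents division by zero when both masks are empty.
import numpy as np

def soft_dice_loss(p, t, eps=1e-7):
    intersection = np.sum(p * t)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(p) + np.sum(t) + eps)

t = np.array([[1, 0], [1, 0]], dtype=float)
soft_dice_loss(t, t)  # perfect prediction -> loss near 0
```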
Evaluation Metric: Intersection over Union (IoU)
Aka Jaccard Index
Similar to the Dice coefficient; a standard metric for image segmentation
26
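A minimal NumPy sketch of the metric for binary masks (the empty-mask convention below is a choice, not something the slide specifies):

```python
# IoU (Jaccard index) between two binary masks.
import numpy as np

def iou(pred, truth):
    pred, truth = pred.astype(bool), truth.astype(bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return np.logical_and(pred, truth).sum() / union

a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [0, 0]])
iou(a, b)  # intersection 1, union 2 -> 0.5
```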
Results
IoU = 0.73 after 23 training epochs
Related results: DSTL Kaggle competition
IoU = 0.84 on crop vs building/road/water/etc. segmentation
https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection/discussion/29790
27
What is Apache Beam?
Agnostic (unified batch + stream) programming model
Java, Python, Go SDKs
Runners for:
  Apache Flink
  Apache Spark
  Google Cloud Dataflow
  Local DirectRunner
28
Why Apache Beam?
Portable: code abstraction that can be executed on different backend runners
Unified: one API for batch and streaming
Extensible model and SDKs: extensible API to define custom sinks and sources
29
The Apache Beam Vision
End users: create pipelines in a familiar language
SDK writers: make Beam concepts available in new languages
Runner writers: support Beam pipelines in distributed processing environments
30
Inference Pipeline
31
Beam Inference Pipeline

pipeline_options = PipelineOptions(pipeline_args)
pipeline_options.view_as(SetupOptions).save_main_session = True
pipeline_options.view_as(StandardOptions).streaming = True

with beam.Pipeline(options=pipeline_options) as p:
    # argument lists reconstructed; the original slide truncates these lines
    filtered_images = (p
        | "Read Images" >> beam.Create(glob.glob(known_args.input))
        | "Batch elements" >> beam.BatchElements(0, known_args.batchsize)
        | "Filter Cloudy images" >> beam.ParDo(
              FilterCloudyFn.FilterCloudyFn(known_args.models_dir)))
    filtered_images | "Segment for Land use" >> beam.ParDo(
        UNetInference.UNetInferenceFn(known_args.models_dir, known_args.output))
32
Cloud Classifier DoFn

class FilterCloudyFn(apache_beam.DoFn):
    def process(self, element):
        """
        Returns clear images after filtering the cloudy ones
        :param element:
        :return:
        """
        clear_images = []
        batch = self.load_batch(element)
        batch = batch.as_in_context(self.ctx)
        preds = mx.nd.argmax(self.net(batch), axis=1)
        idxs = np.arange(len(element))[preds.asnumpy() == 0]
        clear_images.extend([element[i] for i in idxs])
        yield clear_images
33
U-Net Segmentation DoFn

class UNetInferenceFn(apache_beam.DoFn):
    def save_batch(self, filenames, predictions):
        for idx, fn in enumerate(filenames):
            base, ext = os.path.splitext(os.path.basename(fn))
            mask_name = base + "_predicted_mask" + ext
            imsave(os.path.join(self.output, mask_name), predictions[idx])
34
Demo
35
No Tulip Fields
36
Large Tulip Fields
37
Small Tulip Fields
38
Future Work
39
Classify Rock Formations
Using Shortwave Infrared images (2.107 - 2.294 µm)
Radiant energy reflected/transmitted per unit time (Radiant Flux)
E.g.: plants don't grow on rocks
https://en.wikipedia.org/wiki/Radiant_flux
40
Measure Crop Health
Using Near-Infrared (NIR) radiation
Strongly reflected by plant chlorophyll and mesophyll
Chlorophyll content differs between plants and plant stages
A good measure to identify different plants and their health
https://en.wikipedia.org/wiki/Near-infrared_spectroscopy#Agriculture
41
Use images from Red band
Identify borders and regions that show little detail to the naked eye - wonder why?
Images are in the Red band
Unsupervised Learning - Clustering
42
Credits
Jose Contreras, Matthieu Guillaumin, Kellen
Sunderland (Amazon - Berlin)
Ali Abbas (HERE - Frankfurt)
Apache Beam: Pablo Estrada, Lukasz Cwik, Sergei
Sokolenko (Google)
Pascal Hahn, Jed Sundvall (Amazon - Germany)
Apache OpenNLP: Bruno Kinoshita, Joern Kottmann
Stevo Slavic (SAP - Munich)
43
Links
Earth on AWS: https://aws.amazon.com/earth/
Semantic Segmentation - U-Net:
https://medium.com/@keremturgutlu/semantic-
segmentation-u-net-part-1-d8d6f6005066
ResNet: https://arxiv.org/pdf/1512.03385.pdf
U-Net: https://arxiv.org/pdf/1505.04597.pdf
44
Links (contd)
Apache Beam: https://beam.apache.org
Slides: https://smarthi.github.io/BBuzz18-Satellite-
image-classification-for-landuse
Code: https://github.com/smarthi/satellite-images
45
Questions?
46
