ExtremeEarth: From Copernicus Big Data to Extreme Earth Analytics
This project has received funding from the European Union’s Horizon 2020
research and innovation programme under grant agreement No 825258.
Scalable Deep Learning Techniques for Copernicus Data
Phi-Week 2019, Frascati - September 12th 2019
Sina Sheikholeslami, KTH
Theofilos Kakantousis, Logical Clocks
Agenda
• What is Hopsworks
• End-to-end Machine Learning Pipelines
• Deep Learning Pipelines in Hopsworks
• Scalable Distributed Deep Learning Techniques
• Deep Learning Techniques for Earth Observation data
• Summary
What is Hopsworks
Hopsworks REST API
• Manage resources via the REST API (see the request sketch after this list)
o Projects
o Datasets
o Jobs
o FeatureStore
o Experiments
o ModelServing
o Kafka
o ...
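A minimal sketch of driving the Hopsworks REST API from Python with `requests`. The host, endpoint paths, auth header, project id, and job name below are illustrative assumptions; consult your cluster's API documentation for the exact routes.

```python
# Hypothetical example: list projects and start a job via the Hopsworks REST API.
# All values here are placeholders, not real endpoints or credentials.
import requests

API = "https://hopsworks.example.com/hopsworks-api/api"
headers = {"Authorization": "Bearer <jwt-token>"}  # token obtained at login

# List the projects the authenticated user is a member of
for project in requests.get(f"{API}/project", headers=headers).json():
    print(project["name"])

# Start an execution of a pre-configured job inside a project
project_id, job_name = 119, "feature_engineering"
requests.post(f"{API}/project/{project_id}/jobs/{job_name}/executions",
              headers=headers)
```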
End-to-end Machine Learning & Deep Learning Pipelines
End-to-End ML & DL Pipelines can be factored into stages
[Pipeline diagram: Raw Data → Data Ingest → Data Prep → Train → Serve → Online Monitor, with all stages backed by Distributed Storage (Data Lake) and a Resource Manager]
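To make the factoring concrete, here is a minimal Python skeleton (function names and paths are illustrative, not a Hopsworks API) in which each stage communicates only through distributed storage, so any stage can be re-run independently:

```python
# Illustrative skeleton of the factored pipeline; every name and path is a
# placeholder. Each stage reads from and writes to distributed storage.

def data_ingest(raw_path, lake_path):
    """Land raw Copernicus products in the data lake."""
    ...

def data_prep(lake_path, train_path):
    """Validate, transform, and materialize training data."""
    ...

def train(train_path, model_path):
    """Fit a model, checkpointing to distributed storage."""
    ...

def serve(model_path):
    """Deploy the trained model behind a serving endpoint."""
    ...

def online_monitor(endpoint):
    """Track live predictions for drift and quality regressions."""
    ...

data_ingest("/Projects/demo/Raw", "/Projects/demo/Lake")
data_prep("/Projects/demo/Lake", "/Projects/demo/Train")
train("/Projects/demo/Train", "/Projects/demo/Models")
serve("/Projects/demo/Models")
online_monitor("https://serving.example.com/sea_ice")
```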
End-to-End ML & DL Pipelines in Hopsworks
Typical Feature Store pipeline
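A sketch of such a pipeline with the hops-util-py feature store API. Function names follow the 2019-era library and may differ across versions; `raw_df` is an assumed, pre-loaded Spark DataFrame:

```python
from hops import featurestore

# 1. Engineer features with Spark and register them as a feature group
#    (raw_df is an assumed Spark DataFrame of SAR scene data)
features_df = raw_df.selectExpr("scene_id", "mean_backscatter", "ice_concentration")
featurestore.create_featuregroup(features_df, "sar_scene_features")

# 2. Select features (possibly joined across feature groups) for a model
train_df = featurestore.get_features(["mean_backscatter", "ice_concentration"])

# 3. Materialize a training dataset in a training-friendly format
featurestore.create_training_dataset(train_df, "sea_ice_train",
                                     data_format="tfrecords")
```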
Deep Learning Pipelines in Hopsworks
Data Access - Copernicus EO data
Feature Engineering
Experiments Overview - Hopsworks UI
Experiments Details - Hopsworks UI
Experiments - Hopsworks API
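As a hedged sketch of the API (hops-util-py, 2019-era signatures; details vary by version): wrap the training logic in a Python function and hand it to the experiment module, which versions logs, metrics, and TensorBoard events for the run:

```python
from hops import experiment, tensorboard

def train():
    log_dir = tensorboard.logdir()  # TensorBoard directory managed for this run
    # ... build and fit a model here, writing summaries under log_dir ...
    accuracy = 0.93  # placeholder metric
    return accuracy  # the returned metric is recorded with the experiment

experiment.launch(train, name="sea_ice_cnn",
                  description="baseline patch classifier")
```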
Experiments - TensorBoard
TensorFlow Extended with Beam Portability Framework and Flink runner
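For example, a TFX component such as TensorFlow Data Validation emits a Beam pipeline that can run on Flink through the portability framework. The job-server address and options below are illustrative for a 2019-era setup:

```python
import tensorflow_data_validation as tfdv
from apache_beam.options.pipeline_options import PipelineOptions

# Point Beam at a Flink portable job server (assumed to listen on :8099)
options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",
    "--environment_type=DOCKER",
])

# Full-pass statistics over TFRecords, computed as a distributed Flink job
stats = tfdv.generate_statistics_from_tfrecord(
    data_location="hdfs:///Projects/demo/Training_Datasets/sea_ice_train/*",
    pipeline_options=options)
schema = tfdv.infer_schema(stats)
```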
TensorFlow Model Analysis
https://medium.com/tensorflow/introducing-tensorflow-model-analysis-scaleable-sliced-and-full-pass-metrics-5cde7baf0b7b
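A sketch of sliced, full-pass evaluation with TFMA, in the style of the blog post above. API names varied across 2019-era TFMA releases, and the paths and slicing column are illustrative:

```python
import tensorflow_model_analysis as tfma

# EvalSavedModel exported alongside the trained model (path is a placeholder)
eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path="/Projects/demo/Models/sea_ice/eval_saved_model")

results = tfma.run_model_analysis(
    eval_shared_model=eval_shared_model,
    data_location="/Projects/demo/Training_Datasets/eval.tfrecords",
    slice_spec=[
        tfma.slicer.SingleSliceSpec(),                    # overall metrics
        tfma.slicer.SingleSliceSpec(columns=["region"]),  # per-region slices
    ])

# Renders an interactive slicing browser in a Jupyter notebook
tfma.view.render_slicing_metrics(results, slicing_column="region")
```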
Data Scientist Dev View - Jupyter Dashboard
Dev View - Python first
No need to write Dockerfiles
[Diagram: Hopsworks Cluster]
Dev View - Orchestration
Orchestration with Airflow
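A hedged sketch of an Airflow DAG chaining Hopsworks jobs. Hopsworks bundles an Airflow plugin with a job-launch operator; the import path, operator name, and arguments follow 2019-era documentation and should be treated as illustrative:

```python
import airflow
from airflow import DAG
from hopsworks_plugin.operators.hopsworks_operator import HopsworksLaunchOperator

args = {"owner": "demo", "start_date": airflow.utils.dates.days_ago(1)}
dag = DAG("sea_ice_pipeline", default_args=args, schedule_interval="@daily")

# Each task starts a pre-configured Hopsworks job (job names are placeholders)
feature_eng = HopsworksLaunchOperator(task_id="feature_engineering", dag=dag,
                                      project_name="demo",
                                      job_name="feature_engineering")
training = HopsworksLaunchOperator(task_id="training", dag=dag,
                                   project_name="demo", job_name="training")

feature_eng >> training  # run feature engineering before training
```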
Scalable Distributed Deep Learning
Distributed Deep Learning in Hopsworks
[Architecture diagram: a Spark Driver coordinates Executors 1..N, backed by HopsFS (HDFS), TensorBoard, and Model Serving]
Data Parallel Distributed Training (Synchronous Stochastic Gradient Descent, SGD)
[Plot: training time vs. generalization error]
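A toy numpy sketch of the idea: each worker computes a gradient on its own data shard, the gradients are averaged, and every replica applies the same update, which is equivalent to one large-batch SGD step. The model, data, and worker count are illustrative:

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of mean-squared error for a linear model y ~ X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X, y = rng.normal(size=(1024, 8)), rng.normal(size=1024)
w = np.zeros(8)
n_workers, lr = 4, 0.1

for step in range(100):
    shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
    grads = [gradient(w, Xs, ys) for Xs, ys in shards]  # in parallel on workers
    w -= lr * np.mean(grads, axis=0)                    # one synchronized update
```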
Distributed Deep Learning in Hopsworks
Hopsworks (on HopsYARN) supports a heterogeneous mix of GPUs, e.g.:
• 4 GPUs on any host
• 10 GPUs on 1 host
• 100 GPUs on 10 hosts with InfiniBand
Ring-AllReduce vs Parameter Server
[Diagram: in ring-allreduce, GPUs 0-3 form a ring, each simultaneously sending to one neighbor and receiving from the other; in the parameter-server design, GPUs 1-4 all send gradients to and receive updates from central Param Server(s)]
Network Bandwidth is the Bottleneck for Distributed Training
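A single-process toy simulation of ring all-reduce helps show why it uses bandwidth well: each worker sends roughly 2(N-1)/N times the gradient size per iteration, independent of N, whereas a parameter server's links must carry every worker's full gradient. This sketch sums one gradient per simulated worker:

```python
import numpy as np

def ring_allreduce(tensors):
    """Simulate summing one equal-shape gradient per 'worker' over a ring."""
    n = len(tensors)
    chunks = [np.array_split(t.astype(float), n) for t in tensors]

    # Reduce-scatter: after n-1 steps, worker r owns the full sum of one chunk
    for step in range(n - 1):
        sends = [(r, (r - step) % n, chunks[r][(r - step) % n].copy())
                 for r in range(n)]
        for r, idx, payload in sends:
            chunks[(r + 1) % n][idx] += payload

    # All-gather: circulate the completed chunks around the ring
    for step in range(n - 1):
        sends = [(r, (r + 1 - step) % n, chunks[r][(r + 1 - step) % n].copy())
                 for r in range(n)]
        for r, idx, payload in sends:
            chunks[(r + 1) % n][idx] = payload

    return [np.concatenate(c) for c in chunks]

# Four workers, each with its own gradient vector
grads = [np.arange(8.0) * (rank + 1) for rank in range(4)]
for reduced in ring_allreduce(grads):
    assert np.allclose(reduced, np.arange(8.0) * 10)  # 1+2+3+4 = 10
```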
Distributed Deep Learning in Hopsworks
• Uses Apache Spark/YARN to add distribution to TensorFlow's CollectiveAllReduceStrategy (see the sketch below)
  – Automatically builds the ring (Spark/YARN)
  – Allocates GPUs to Spark Executors
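A sketch of what the user-side training function looks like: Hopsworks starts one TensorFlow worker per Spark executor and sets TF_CONFIG so the collective ops can form the ring. The launch-call name (`experiment.mirrored`) follows the 2019-era hops-util-py API and is an assumption; check your library version:

```python
from hops import experiment

def train():
    import tensorflow as tf
    # TF_CONFIG, set by Hopsworks on each executor, lists this worker's peers
    strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
    with strategy.scope():
        model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(8,))])
        model.compile(optimizer="sgd", loss="mse")
    # ... model.fit(...) on this worker's shard; gradients are all-reduced ...

experiment.mirrored(train, name="allreduce_demo")
```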
Scalable Deep Learning Techniques for Satellite Data
Polar Use Case EE Ambitions - I
• Deep Learning Architecture
o Now: Most deep architectures developed for classifying remote sensing data focus on VHR optical images. Existing pretrained networks are not optimized for the specific properties of Copernicus data (SAR images).
o KPI: The new deep architecture will improve the classification accuracy on Copernicus data compared to state-of-the-art classifiers.
• Sea Ice Charts
o Now: Sea ice charts are based on trained ice analysts manually segmenting a combination of SAR and other images into smaller polygons to record parameters including, but not limited to, sea ice concentration and stage of development.
o KPI: Automatic production of more accurate and reliable sea ice products on the Hops data platform
• Distributed Deep Learning:
o Now: No work currently exists in the international literature on scale-out deep learning for remote sensing data.
o KPI: Classification algorithms running on Hops and scaling to PBs of Copernicus data.
Polar Use Case EE Ambitions - II
• Ice Charts
o Now: Ice charts are currently manual interpretations with inherent human quality control but also potential for bias.
o KPI: Methods must be robust and reliable throughout the sea ice seasonal cycle
• Computing power
o Now: New ice mapping algorithms currently run only for limited areas and seasons, using limited local processing resources.
o KPI: The Hops data platform is able to cope with input data volume and processing load to enable NRT product availability.
• Operational Delivery
o Now: New automatic ice information products are not available to end users
o KPI: Demonstrate availability of new automatic ice mapping products with demo users
Polar Use Case EE Challenges
• Training data
o Need for sufficient and accurate training data
o Gathering high-quality ground truth is expensive and sometimes not feasible
• Resolution
o Each pixel represents a large area of land (100m x 100m)
o Accurate characterization can be jeopardized by misclassification of even single pixels
• Size of SAR Images
o Each image is typically 1+ GB in size
o Many pixels or patches must be processed during training and inference
• Porting and storing data to Hadoop
o Data must be prepared in efficient data structures
o An appropriate data format for storing training data is required (see the TFRecord sketch below)
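A minimal sketch of one such format: packing labeled SAR patches into TFRecords for storage on HopsFS/HDFS. The feature names, patch shape, and the `patches` iterable are illustrative assumptions:

```python
import numpy as np
import tensorflow as tf

def patch_to_example(patch, label):
    """Serialize one 32x32 float patch and its integer label."""
    return tf.train.Example(features=tf.train.Features(feature={
        "patch": tf.train.Feature(bytes_list=tf.train.BytesList(
            value=[patch.astype(np.float32).tobytes()])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))

# `patches` is an assumed iterable of (np.ndarray of shape (32, 32), int) pairs
with tf.io.TFRecordWriter("sea_ice_patches.tfrecords") as writer:
    for patch, label in patches:
        writer.write(patch_to_example(patch, label).SerializeToString())
```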
Application Road Map
[Diagram: Preprocessing (Calibration, Speckle Noise filtering, Terrain Correction) → Feature Extraction / Feature Learning by Deep Learning (non-linear function) → Classification and Segmentation → Applications: Sea Ice Edges, Sea Ice Types and Concentration, Iceberg Detection]
Deep Learning
[Diagram: approaches span Transfer Learning vs. Ad hoc Architectures, and Distributed Training of Existing Architectures vs. Designing Specific Distributed Architectures for Remote Sensing]
• Pixel-wise VGG16 trained from scratch: 103,754,000 patches analyzed to label each pixel
• Patch-based VGG16 fine-tuning on 32x32 patches (sea ice edges; sketched below)
• Patch-based ad hoc network on 32x32 patches
• Semi-supervised distributed training (GANs)
• All tests have been performed on Hopsworks
• High-resolution, pixel-wise processing requires analyzing 100+M patches - approx. 9 hours/image!
• New techniques for distributed training are needed in the remote sensing domain
S. Khaleghian, T. Kræmer, T. Eltoft, A. Marinoni, “Distributed Deep learning for sea ice edges detection”, in prep.
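A sketch of the patch-based VGG16 fine-tuning approach listed above: freeze the ImageNet-pretrained convolutional base and train a small head on 32x32 patches. Layer sizes and the two-class output are illustrative, not the paper's exact configuration:

```python
import tensorflow as tf

# ImageNet-pretrained convolutional base; 32x32 is VGG16's minimum input size
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(32, 32, 3))
base.trainable = False  # fine-tune only the new classification head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),  # e.g. ice vs. open water
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(patch_dataset, epochs=...)  # patches like those in the TFRecords sketch
```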
Summary
• End-to-end Machine Learning pipelines with Hopsworks
• Scalable Deep Learning techniques
• Deep Learning techniques for Earth Observation data
Hopsworks Contributors
Jim Dowling, Seif Haridi, Gautier Berthou, Salman Niazi, Mahmoud Ismail, Theofilos Kakantousis, Ermias
Gebremeskel, Fabio Buso, Antonios Kouzoupis, Kim Hammar, Steffen Grohsschmiedt, Alex Ormenisan,
Robin Andersson, Moritz Meister, Kajetan Maliszewski, Netsanet Gebretsadkan Kidane, Sina Sheikholeslami,
Joel Stenkvist, August Bonds, Vasileios Giannokostas, Johan Svedlund Nordström, Rizvi Hasan, Paul Mälzer,
Bram Leenders, Juan Roca, Misganu Dessalegn, K “Sri” Srijeyanthan, Jude D’Souza, Alberto Lorente, Andre
Moré, Ali Gholami, Davis Jaunzems, Stig Viaene, Hooman Peiro, Evangelos Savvidis, Qi Qi, ...
@hopsworks
Thank you!
@logicalclocks
www.logicalclocks.com
sinash@kth.se
theo@logicalclocks.com
