Machine Learning, Open Data, and the Future WarFighter
1. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Jed Sundwall
Global Open Data Lead, Public Sector, Amazon Web Services
Balaji Iyer
Machine Learning Business Development Lead, Public Sector, Amazon Web Services
Machine Learning, Open Data, and
the Future WarFighter
2.
Open Data on AWS
3.
Why does AWS care about Open Data?
Many of our public sector
customers are required to make
their data available to the public.
Sharing data on AWS makes it accessible to a large and growing community of
researchers, entrepreneurs, and enterprises who use the AWS Cloud.
Many of our commercial sector
customers rely on access to open
data to develop their products.
4.
The AWS Open Data program
expands access to data by
staging it for analysis in the
cloud.
https://opendata.aws
5.
AWS Public Datasets
https://registry.opendata.aws
6.
“No matter who you are, most of
the smartest people work for
someone else.”
— Bill Joy, co-founder of Sun Microsystems
7.
Traditional Data Acquisition
“…data must be organized, well-documented, consistently formatted,
and error free. Cleaning the data is often the most taxing part of data
science, and is frequently 80% of the work.”
— Data Driven by DJ Patil and Hilary Mason
8.
Data Acquisition in the Cloud
Amazon S3
Amazon EC2
Amazon Athena
Amazon EMR
Amazon Redshift
AWS Glue
AWS Data Pipeline
9.
Undifferentiated Heavy Lifting
10.
Advantages of sharing data in the cloud
Global community of users
Reduced time to insight
Lower cost of research
New services and tools
12.
Staging data for analysis in the cloud
Amazon S3 allows programmatic
and precise access to data at
planetary scale.
Landsat on AWS uses Cloud
Optimized GeoTIFFs that allow
users to get only the data they
need when they need it.
cogeo.org
AWS Lambda
.tar
USGS
Cloud Optimized GeoTIFFs
s3://landsat-pds
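The slide above says Cloud Optimized GeoTIFFs let users fetch only the data they need. A minimal sketch of that idea, using only the Python standard library and assuming the classic (non-BigTIFF) TIFF layout: a reader first requests a small byte range holding the file header, learns where the first image file directory (IFD) starts, and then requests further ranges as needed. The names `range_header` and `parse_tiff_header` are hypothetical helpers for illustration, not part of any AWS or GeoTIFF library, and no network request is made here.

```python
import struct


def range_header(offset, length):
    """Return an HTTP Range header value for `length` bytes starting at `offset`.

    HTTP byte ranges are inclusive on both ends, so the last byte requested
    is offset + length - 1.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return f"bytes={offset}-{offset + length - 1}"


def parse_tiff_header(header):
    """Parse the 8-byte classic TIFF header.

    Layout: 2 bytes byte order ("II" little-endian, "MM" big-endian),
    2 bytes magic number (42), 4 bytes offset of the first IFD.
    """
    if len(header) < 8:
        raise ValueError("need at least 8 bytes")
    order = header[:2]
    if order == b"II":
        endian = "<"
    elif order == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:
        # BigTIFF uses 43 and a longer header; out of scope for this sketch.
        raise ValueError("not a classic TIFF file")
    return {
        "byte_order": "little" if endian == "<" else "big",
        "first_ifd_offset": ifd_offset,
    }


if __name__ == "__main__":
    # Stand-in for the first 8 bytes of a file; a real reader would obtain
    # them with an HTTP GET that sets the Range header built below.
    fake_header = b"II" + struct.pack("<HI", 42, 8)
    print(range_header(0, 8))
    print(parse_tiff_header(fake_header))
```

In practice, libraries such as GDAL and rasterio perform these range reads internally, so most users never build the headers by hand.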
13.
Landsat-tiler from Mapbox
https://github.com/mapbox/landsat-tiler
14.
Graph by Drew Bollinger (@drewbo19) at Development Seed
Landsat on AWS
15.
Elevation Models
Aerial Imagery
Climate Models
Satellite Imagery
High-resolution Radar
aws.amazon.com/earth
16.
Elevation Models
Aerial Imagery
Climate Models
Satellite Imagery
High-resolution Radar
aws.amazon.com/earth/research-credits
17.
Machine Learning on AWS
18.
ML @ AWS
OUR MISSION
Put machine learning in the
hands of every developer and
data scientist
19.
FRAMEWORKS / INTERFACES
KERAS
PLATFORMS
APPLICATION SERVICES
REKOGNITION
REKOGNITION VIDEO
POLLY
TRANSCRIBE
TRANSLATE
COMPREHEND
LEX
Amazon SageMaker
Amazon Mechanical Turk
The AWS Machine Learning Stack
20.
BUILD
Where should you spend your time?
Design training experiments
21.
BUILD
TRAIN
Where should you spend your time?
Run scalable training
22.
BUILD
TRAIN
Where should you spend your time?
Tune hyperparameters
TUNE
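One simple way to picture the tuning step is random search: sample hyperparameter values from a range, score each candidate, and keep the best. The sketch below is illustrative only. It is not the SageMaker API and not necessarily the search strategy SageMaker uses; `random_search`, `toy_loss`, and the parameter names are hypothetical stand-ins for a real training job.

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Minimize `objective` by sampling each parameter uniformly from its range.

    `space` maps a parameter name to a (low, high) tuple.
    Returns the best parameter dict found and its score.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_loss(params):
    """Stand-in for a validation loss; a real one would train and evaluate a model."""
    return (params["learning_rate"] - 0.1) ** 2 + (params["dropout"] - 0.5) ** 2


if __name__ == "__main__":
    space = {"learning_rate": (0.0, 1.0), "dropout": (0.0, 1.0)}
    best, score = random_search(toy_loss, space, n_trials=50, seed=1)
    print(best, score)
```

Each trial here is independent, which is why this kind of search parallelizes easily across training jobs; smarter strategies reuse earlier results to pick the next candidate.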
23.
BUILD
TRAIN
DEPLOY
Where should you spend your time?
TUNE
Deploy and operate
24.
Data Selection
Data Cleansing
Data Annotation
Model Selection
Model Evaluation
Model Deployment
Data readiness challenges
25.
Pre-built Jupyter notebooks
Built-in, high performance algorithms
TRAIN
DEPLOY
BUILD
Build, train, tune, and host your own models
TUNE
Amazon SageMaker
26.
Fully managed hosting with auto-scaling
One-click deployment
Pre-built notebooks for common problems
Built-in, high performance algorithms
One-click training
BUILD
TRAIN & TUNE
DEPLOY
Build, train, tune, and host your own models
Hyperparameter optimization
Amazon SageMaker
27.
Pre-built notebooks for common problems
Built-in, high performance algorithms
One-click training
Hyperparameter optimization
DEPLOY
BUILD
TRAIN & TUNE
Build, train, tune, and host your own models
Amazon SageMaker
28.
Fully managed hosting with auto-scaling
One-click deployment
Pre-built notebooks for common problems
Built-in, high performance algorithms
One-click training
BUILD
TRAIN & TUNE
DEPLOY
Build, train, tune, and host your own models
Hyperparameter optimization
End-to-end encryption
End-to-end VPC support
Compliance and audit capabilities
Metadata and experiment management capabilities
Pay as you go
Amazon SageMaker
29.
Remove undifferentiated heavy lifting of ML
Iterate, iterate, iterate!
Training speed
Inference speed
Speed in all phases
Amazon SageMaker
30.
Collect and prepare training data
Choose and optimize your ML algorithm
Set up and manage environments for training
Train and tune model (iterate, iterate)
Scale and manage the production environment
Deploy model in production
Amazon SageMaker
31.
Customer Success With Amazon SageMaker
32.
Machine Learning and Open Data on
AWS
33.
SpaceNet
34.
The SpaceNet Dataset is an open repository of more than 5,700 km² of satellite
imagery across 5 cities, 520,000+ vectors, and a series of challenges to
accelerate geospatial machine learning.
Automated Mapping Challenge: Building Extraction Rounds 1 & 2
Nov. 2016 – Jun. 2017
High Revisit Challenge: Off-Nadir Object Detection
Launching Spring 2018
Automated Mapping Challenge: Road Network Extraction
Nov. 2017 – Feb. 2018
35.
SpaceNet Challenge Winning Examples
AOI 2 Vegas: Image 1014
AOI 3 Paris: Image 1729
AOI 5 Khartoum: Image 991
Challenge prizes totaling more than $100K to date
https://spacenetchallenge.github.io/
36.
Functional Map of the World Challenge
37.
Functional Map of the World Challenge
Managed by the Intelligence Advanced Research Projects Activity (IARPA), the
challenge published more than one million points of interest from over 200
countries. The dataset contains satellite-specific metadata that researchers can
exploit to build a competitive algorithm that classifies facility, building, and land use.
• Challenge wrapped up with ~60 contributors proposing trained models
• Winning solutions have been open-sourced with the goal of pushing the
community forward
• Data will be moving to SpaceNet as a permanent home on AWS
38.
DIUx xView 2018 Detection Challenge
39.
DIUx xView 2018 Detection Challenge
Defense Innovation Unit Experimental (DIUx) and the National Geospatial-
Intelligence Agency (NGA) are releasing a new satellite imagery dataset to advance
key frontiers in computer vision and develop new solutions for national security and
disaster response.
• Seeks to build on SpaceNet and Functional Map of the World (FMoW) to apply computer vision to the
growing amount of available imagery from space so that we can understand the
visual world in new ways and address a range of important applications.
• Four main goal areas: Reduce minimum resolution for detection; Improve
learning efficiency; Enable discovery of more object classes; Improve detection
of fine-grained classes.
• Data is available on AWS and the challenge is currently under way with
$150,000 in prizes.
40.
Please complete the session survey in
the mobile summit app!
41.
Thank you!