INTRO
All-hands Knowledge-sharing Lunch!
• This month's session is about Applied Machine Learning (ML): a personal
test project I am working on, the reasons behind it, and the underlying
technology.
• The project uses APIs from Cloud vendors to sift through satellite images.
• The goal today is to start a discussion around Emerging Tech at NASA.
Applied ML - Harsh Prakash 1
WHAT IS ML?
• ML, a subset of Artificial Intelligence (AI) and a superset of Deep
Learning (DL), enables a user to perform specific tasks, like predicting
outcomes and recognizing images, without explicit instructions. It learns
from patterns in data and draws inferences with minimal human intervention.
• NLP, a subset of AI, helps a user read, analyze, interpret and understand
natural language data, and perform speech recognition.
• AI helps a user’s computer systems learn (acquire data and the rules
governing its use), reason (reach conclusions), problem-solve and self-
correct to inform its decisions.
• A neural network is at the heart of it: it is designed to recognize
patterns (variables that rise and fall together).
LIVE DEMO *
• A web app on Apache, connected to a video camera, uses the AWS SDK to call
the Rekognition API for near real-time image analysis.
• ML assigns LABELS/TAGS, and the Model API returns them as a raw JSON
response.
• The demo can adjust MAXLABELS, MINCONFIDENCE, etc., and can be ported to
Lambda/S3 Bucket to send alerts.
* Service currently available in AWS GovCloud (US-West) only.
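The demo's label call can be sketched roughly as follows. This is a minimal sketch, not the demo's actual code: it assumes the AWS SDK for Python (boto3) and its Rekognition `detect_labels` call, with default credentials and region already configured. The helper names and the sample response values are hypothetical.

```python
from __future__ import annotations

from typing import Any


def summarize_labels(response: dict[str, Any]) -> list[tuple[str, float]]:
    """Return (name, confidence) pairs from a DetectLabels-style response."""
    labels = response.get("Labels", [])
    pairs = [(item["Name"], float(item["Confidence"])) for item in labels]
    # Highest-confidence labels first.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def detect_labels(image_bytes: bytes, max_labels: int = 10,
                  min_confidence: float = 75.0) -> list[tuple[str, float]]:
    """Send one image (for example a captured video frame) to Rekognition."""
    import boto3  # imported lazily so the parsing helper works without the SDK

    client = boto3.client("rekognition")
    response = client.detect_labels(
        Image={"Bytes": image_bytes},
        MaxLabels=max_labels,          # cap on the number of labels returned
        MinConfidence=min_confidence,  # drop labels below this confidence (%)
    )
    return summarize_labels(response)


# Hand-written sample in the shape of the raw JSON response (not real output).
sample_response = {
    "Labels": [
        {"Name": "Outdoors", "Confidence": 88.0},
        {"Name": "Urban", "Confidence": 94.0},
    ]
}
ranked = summarize_labels(sample_response)
```

Keeping the response parsing separate from the network call makes it easy to reuse the same helper if the call is later moved into a Lambda function.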
BEST USE?
COLLEGE PROJECT *
Growth Study for Charlottesville VA, 2000-2030
Annual Scholarship, 2001
Used satellite images and Census data to compute population growth
distribution –
• Divided study area of the county into 5,745 grid cells (250 meters x 250
meters).
• Traditional compute model assigned growth weights based on development
indicators at the neighborhood level.
* https://www.slideshare.net/gisblog/gis-growth-study-for-charlottesville-va-20002030-plan-885-vamlis-2001-38716260
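One simple way to read the weight-based model above is as a proportional allocation: each grid cell receives a share of projected growth in proportion to its weight. The sketch below illustrates that idea only. The slides do not give the study's actual formula, and the weights and growth total here are hypothetical; the cell count and cell size come from the slide.

```python
from __future__ import annotations


def allocate_growth(total_growth: float, weights: list[float]) -> list[float]:
    """Split total_growth across grid cells in proportion to their weights."""
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("at least one cell needs a positive weight")
    return [total_growth * w / weight_sum for w in weights]


CELL_SIDE_M = 250   # from the slide: 250 m x 250 m cells
CELL_COUNT = 5745   # from the slide

# Total gridded area implied by the cell count and cell size, in km^2.
study_area_km2 = CELL_COUNT * (CELL_SIDE_M / 1000) ** 2

# Hypothetical weights for four cells and a hypothetical growth total.
example = allocate_growth(1000, [1, 3, 0, 4])
```

A cell with weight 0 receives no growth, which is how a model like this can exclude undevelopable cells.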
[Figure: Development Indicator]
TEST PROJECT *
• As volunteer Directors, our focus is on mapping poverty hotspots.
• Using Cloud-based ML model with satellite images to detect development
indicators at the neighborhood level.
* https://www.globalmapaid.org/patron-directors/
STEPS
1. Opened account with Google Cloud Platform (GCP).
2. Enabled Google Maps API for project.
3. Enabled billing for project to fetch more than 1 satellite image per day
using API key.
4. Tuning model for known test areas. E.g. New York...
• Using satellite images for Ethiopia’s capital, Addis Ababa, from Google
Maps API at their highest available resolution (zoom: 17, or 1x1 sq.
mile).
• Using Cloud-based ML model to classify satellite images by infrastructure
levels.
• Assuming correlation between infrastructure and visual indicators in
satellite images.
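The image-fetching step above could be sketched as below. The slides only say "Google Maps API", so this assumes the Maps Static API endpoint and its `center`, `zoom`, `size`, `maptype` and `key` parameters. The coordinates are an approximate center for Addis Ababa, and the image size, function name and API key placeholder are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint: Google Maps Static API.
STATIC_MAPS_ENDPOINT = "https://maps.googleapis.com/maps/api/staticmap"


def satellite_image_url(lat, lng, api_key, zoom=17, size="640x640"):
    """Build a request URL for one satellite tile centered on (lat, lng)."""
    params = {
        "center": f"{lat},{lng}",
        "zoom": zoom,             # zoom level named on the slide
        "size": size,             # assumed image size in pixels
        "maptype": "satellite",   # plain satellite imagery, no map overlays
        "key": api_key,
    }
    return f"{STATIC_MAPS_ENDPOINT}?{urlencode(params)}"


# Approximate center of Addis Ababa; replace the key with a real one.
url = satellite_image_url(9.03, 38.74, "YOUR_API_KEY")
```

The resulting image bytes can then be passed to whichever Cloud-based labeling service is being tested.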
KNOWN TEST AREAS
• Bridge – New York City, NY. ML assigns labels: Nature, Outdoors,
Landscape, Scenery.
• City Center – New York City, NY. ML assigns labels: Outdoors, Nature,
Landscape, Scenery, Urban, Building, Neighborhood, Road, Housing, City,
Town, Intersection.
• Rural Town of Cazenovia, NY. ML assigns labels: Landscape, Outdoors,
Nature, Scenery, Aerial View, Land, Urban, Road, Housing, Building, Yard,
Neighborhood.
FINDINGS FROM KNOWN TEST AREAS
• For the City Center in New York City, NY, ML assigns the label “Urban”
with 94% confidence. For the rural Town of Cazenovia, NY, ML assigns the
label “Urban” with 76% confidence. The typical gap between a True Positive
(TP) and a False Positive (FP) is about 15 percentage points.
• Hybrid, Roadmap and Terrain images add noise.
• Real world applicability – If it reinforces what people on the ground
already know, it would be really helpful to Global MapAid donors and
volunteers.
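Because even a rural area receives the “Urban” label at 76%, the label alone does not separate the two test areas; the confidence does. A minimal sketch of a confidence cut-off follows. The 85% threshold is an assumption chosen to sit between the two reported values, not a tuned figure from the project, and the function name is hypothetical.

```python
from __future__ import annotations

URBAN_THRESHOLD = 85.0  # assumed cut-off between the reported 76% and 94%


def classify_area(labels: dict[str, float],
                  threshold: float = URBAN_THRESHOLD) -> str:
    """Call an area urban only if its "Urban" label clears the threshold."""
    urban_confidence = labels.get("Urban", 0.0)  # missing label counts as 0%
    return "urban" if urban_confidence >= threshold else "rural"


city_center = {"Urban": 94.0}  # confidence reported for New York City Center
cazenovia = {"Urban": 76.0}    # confidence reported for the Town of Cazenovia

results = (classify_area(city_center), classify_area(cazenovia))
```

With more known test areas, the threshold could be chosen from the data rather than fixed by hand.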
[Figure: City of Addis Ababa, with Urban and Rural areas labeled]
TODO
• Use other datasets to augment data for BI applications. E.g. Census, IRS,
web searches, survey data from USAID and World Bank, etc.
• Use the k-nearest neighbors (k-NN) algorithm for pattern recognition to
predict for blind spots, and transform ML labels into vectors.
• Use Cloud-based ML to identify patterns early and predict natural
disasters using weather data, food data and agricultural data.
• If ground volunteers or local mining companies confirm charcoal fires
and/or cooking burners on satellite images, then tune model further.
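The k-NN item in the list above, together with the label-to-vector transform, can be sketched in plain Python. This is an illustrative sketch only: the training labels are the two known test areas from the slides, while the binary encoding, the Hamming distance, the choice of k = 1 and the query label sets are all assumptions.

```python
from collections import Counter

# Label sets for the two known test areas, as listed on the slides.
TRAINING = [
    (["Outdoors", "Nature", "Landscape", "Scenery", "Urban", "Building",
      "Neighborhood", "Road", "Housing", "City", "Town", "Intersection"],
     "urban"),
    (["Landscape", "Outdoors", "Nature", "Scenery", "Aerial View", "Land",
      "Urban", "Road", "Housing", "Building", "Yard", "Neighborhood"],
     "rural"),
]


def build_vocabulary(samples):
    """Collect every label seen in training into a sorted list."""
    return sorted({label for labels, _ in samples for label in labels})


def to_vector(labels, vocabulary):
    """Binary vector: 1 if the vocabulary term is present, else 0.

    Labels that are not in the vocabulary are ignored.
    """
    present = set(labels)
    return [1 if term in present else 0 for term in vocabulary]


def hamming(a, b):
    """Number of positions where two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(query_labels, samples, k=1):
    """Majority class among the k training samples nearest to the query."""
    vocabulary = build_vocabulary(samples)
    query = to_vector(query_labels, vocabulary)
    ranked = sorted(
        samples,
        key=lambda item: hamming(to_vector(item[0], vocabulary), query),
    )
    votes = Counter(cls for _, cls in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical label sets for two unlabeled images.
guess_a = knn_predict(["Outdoors", "Urban", "Building", "City", "Intersection"],
                      TRAINING)
guess_b = knn_predict(["Landscape", "Land", "Yard", "Aerial View"], TRAINING)
```

Two training samples are far too few for real use; the point is only to show how label lists become comparable vectors.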
[Figure: Regression for website visitor profile using Census data]
[Figure: Automatic clustering of popular searches on medlineplus.gov for
May 2015, using R STAT, PostGIS]
POTENTIAL AT NASA
• ML and geoanalytics to explore LANDSAT data, and satellite and HELIOS
images –
• Modeling, Analysis and Prediction (MAP) Program – Black Marble maps of
night lights to gain insight on human activity.
• Auto-tagging of media – image, audio and video. E.g. Training videos.
• Log and text analyses.
• Smarter storage. E.g. S3 Intelligent Tiering.
• Solar storms.
POTENTIAL AT NASA
• Solar Storm. ML assigns labels: Nature, Flare, Light, Outdoors, Sun, Sky,
Night, Astronomy, Universe, Outer Space, Space, Moon, Sunrise, Mountain,
Planet.
• Solar Storm. ML assigns labels: Night, Nature, Space, Outdoors, Universe,
Moon, Astronomy, Outer Space, Sun, Sky, Flare, Light, Mountain, Photo,
Photography.
NEXT STEPS
• Model as a Service – ML Models on AWS Marketplace.
• Frameworks and Tools – Rekognition, Google Vision, Microsoft Computer
Vision, TensorFlow, PyTorch, Jupyter Notebook, AWS SageMaker, R STAT.
• Questions?
[Figure: This presentation’s word cloud]
