© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS Rekognition
Rich Image Metadata Extraction Powered by Deep Learning
Adrian Hornsby, Technical Evangelist
@adhorn
adhorn@amazon.com
Artificial neural network
Deep Learning Process
Conv 1 Conv 2 Conv n
…
…
Feature Maps
Labrador
Dog
Beach
Outdoors
Softmax
Probability
Fully
Connected
Layer
Ground Truth Generation
Training
The Challenge For Artificial Intelligence: SCALE
Training
Tons of GPUs
Data
The Challenge For Artificial Intelligence: SCALE
PBs of existing data
Training
Tons of GPUs
Prediction
The Challenge For Artificial Intelligence: SCALE
Tons of GPUs and CPUs
Data
PBs of existing data
Training
Tons of GPUs
Amazon AI Ecosystem
Amazon
Rekognition
Object and scene detection
Facial analysis
Face comparison
Celebrity recognition
Image moderation
Amazon Rekognition: Images In, Rich Metadata
Out
Object & Scene Detection
Object & Scene Detection
Facial Analysis
Facial Analysis
Facial Comparison
Facial Comparison
Celebrity Recognition
Celebrity Recognition
Image moderation
Explicit Nudity
Nudity
Graphic Male Nudity
Graphic Female Nudity
Sexual Activity
Partial Nudity
Suggestive
Female Swimwear or Underwear
Male Swimwear or Underwear
Revealing Clothes
Image moderation
Amazon
Rekognition
Amazon Rekognition: supported actions
CompareFaces
CreateCollection
DeleteCollection
DeleteFaces
DetectFaces
DetectLabels
DetectModerationLabels
GetCelebrityInfo
IndexFaces
ListCollections
ListFaces
RecognizeCelebrities
SearchFaces
SearchFacesByImage
Collection
IndexFaces
SearchFacesbyImage
Nearest neighbor
search
FaceID: 4c55926e-69b3-5c80-8c9b-78ea01d30690
Similarity: 97
FaceID: 02e56305-1579-5b39-ba57-9afb0fd8782d
Similarity: 92
FaceID: 02e56305-1579-5b39-ba57-9afb0fd8782d
Similarity: 85
Demo time
Amazon Rekognition
Customers
• Digital Asset Management
• Media and Entertainment
• Travel and Hospitality
• Influencer Marketing
• Systems Integration
• Digital Advertising
• Consumer Storage
• Law Enforcement
• Public Safety
• eCommerce
• Education
Thank you!
@adhorn
adhorn@amazon.com
Amazon Rekognition Pricing
Free Tier: 5000 images processed per month for first 12 months
Image Analysis Tiers
Price per 1000
images processed
First 1 million images processed* per month $1.00
Next 9 million images processed* per month $0.80
Next 90 million images processed* per month $0.60
Over 100 million images processed* per month $0.40
Face Metadata Storage
Price per 1,000 face
metadata stored per month
Face metadata stored $0.01

AWS Rekognition: Rich Image Metadata Extraction Powered by Deep Learning

Editor's Notes

  • #3 Today Ai is everywhere is Amazon From recommendation pages, to fulfillment centers, Prime Air Drones, Amazon Go and Alexa
  • #7 Such systems can be trained from examples, rather than explicitly specified. They have found most use in applications difficult to express in a traditional computer algorithm using ordinary rule-based programming. Neural networks have been used to solve a wide variety of tasks, including computer vision, speech recognition, machine translation and medical diagnosis.
  • #8 Deep learning is part of a broader family of machine learning methods based on learning data representations. An observation (e.g., an image) can be represented in many ways, eg., as a vector of intensity values per pixel, or in a more abstract way as a set of edges, regions of particular shape, etc. Some representations are better than others at simplifying a specific task. A deep neural network (DNN) is an artificial neural network (ANN) with multiple hidden layers between the input and output layers.[4][6] Similar to shallow ANNs, DNNs can model complex non-linear relationships. DNN architectures generate compositional models where the object is expressed as a layered composition of image primitives.[106] The extra layers enable composition of features from lower layers, giving the potential of modeling complex data with fewer units than a similarly performing shallow network.[4]
  • #9 To achieve accurate results on complex Computer Vision tasks such as object and scene detection, face analysis, and face recognition, deep learning systems need to be tuned properly and trained with massive amounts of labeled ground truth data. Sourcing, cleaning, and labeling data accurately is a time-consuming and expensive task. Moreover, training a deep neural network is computationally expensive and often requires custom hardware built using Graphics Processing Units (GPU).
  • #10 And for us, AI is a system or service which can perform tasks that usually require human intelligence, such as visual perception, speech recognition, decision-making or translation
  • #11 So this includes machine learning and deep learning.
  • #12 The training phase and its self learning feature are well adapted to solving informal problems, but here’s the catch: they involve a lot of math operations which tend to grow exponentially as data size increases and as the number of layers increases. This problem is called the “Curse of Dimensionality” and it’s one of the major reasons why neural networks stagnated for decades: there was simply not enough computing power available to run them at scale.
  • #13 Nor was enough data available. Neural networks need a lot of data to learn properly. The more data, the better! Until recently, it was simply not possible to gather and store vast amounts of digital data.
  • #14 Finally, running the models require a tones of GPUs and CPUs. With the advances in cloud computing, all those conditions are now met – planets are finally alligned for AI to become a reality.
  • #15 Amazon Web Services provides a rich ecosystem to help you build smarter applications. In this context, it is worth highlighting the higher level AI services based on deep learning algorithms, like Amazon Rekognition, an image recognition service, Amazon Polly, a text to speech synthesizer, and Amazon Lex, a voice and text chatbot service. We also provide the infrastructure including GPU EC2 instances for fast parallel processing which you can use in combination with any of the popular deep learning libraries like Apache mxnet, Tensorflow, Theano, etc, all of which are available on the AWS deep learning AMI. For your general machine learning purposes, you can also use EC2, Amazon Elastic MapReduce and Spark with SparkML to run any machine learning algorithm. Another popular library is the python scikit-learn, which you can deploy on AWS Lambda or containers, or EC2. So what I am trying to convey is that there is a lot of choice, which basically boils down to picking the right tool for the right job, where you can make trade-offs between ‘do your own’ with all the flexibility, or picking a managed solution which allows you to get results fast without having to do the heavy lifting.
  • #16 Amazon Rekognition currently supports the JPEG and PNG image formats. You can submit images either as an S3 object or as a byte array. Amazon Rekognition supports image file sizes up to 15MB when passed as an S3 object, and up to 5MB when submitted as an image byte array. Amazon Rekognition is currently available in US East (Northern Virginia), US West (Oregon) and EU (Ireland) regions. Mxnet convolutional deep neural networks (CNNs),
  • #20 Image moderation Rekognition automatically detects explicit or suggestive adult content in your images, and provides confidence scores.
  • #25  You can use the ‘MinConfidence’ parameter in your API requests to balance detection of content (recall) vs the accuracy of detection (precision). 
  • #26 Explicit Nudity Nudity Graphic Male Nudity Graphic Female Nudity Sexual Activity Partial Nudity Suggestive  Female Swimwear or Underwear Male Swimwear or Underwear Revealing Clothes
  • #27 Amazon Rekognition currently supports the JPEG and PNG image formats. You can submit images either as an S3 object or as a byte array. Amazon Rekognition supports image file sizes up to 15MB when passed as an S3 object, and up to 5MB when submitted as an image byte array. Amazon Rekognition is currently available in US East (Northern Virginia), US West (Oregon) and EU (Ireland) regions.
  • #33 The C-SPAN Archives is responsible for indexing speakers in C-SPAN video content. In the past, this manual process was performed by a team of two to three indexers. The initial image collection—consisting of 97,000 known individuals—was indexed by Amazon Rekognition in less than two hours.
  • #34 "By using Amazon Rekognition's advanced facial recognition and metadata analysis, we are able to provide more detailed influencer audience metrics to our growing list of customers such as Bell Media, L’Oreal, Samsung, 20th Century Fox, Walmart and many more."
  • #35 Moderation: To determine if an image contains explicit or suggestive adult content
  • #37  Democratizing Image Analysis (key factors: affordable, accessible, scalable) that helps you get started in minutes