AWS Media Day: Developing Media Services with AWS Artificial Intelligence Services (김기완, Solutions Architect)

김기완, Solutions Architect
Developing Media Services with AWS Artificial Intelligence Services

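Agenda
• The need for artificial intelligence in the media and entertainment industry
• Introduction to AWS artificial intelligence services
• AWS ML Stack
• Vision, Speech, Language
• Deep Learning frameworks / Amazon SageMaker
• Customer case: Chosun Ilbo (조선일보)
• Artificial intelligence use cases in media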

The Need for Artificial Intelligence: Media & Entertainment
There are 3,700,000,000 Internet users in 2017*
1,200,000,000 photos will be taken in 2017 (9% YoY growth)*
50% of 2016 Internet traffic was video, and will likely be 70% by 2021**
Multi-petabyte asset storage with > 1 PB MoM growth is commonplace on AWS
Sources: * InfoTrends Worldwide, ** StreamingMedia.com

Using Artificial Intelligence in the Media Industry
• Richer metadata
• Automated metadata generation for B2B and B2C
• Use of IMDb and active use of cast information
• Large-scale batch processing of digital archives
• Enhanced experiences
• Intelligent content filtering
• Serving suitable content to improve the user experience
• Security and analytics
• Automated filtering of UGC (User Generated Content)
• Workflow improvement
• Audience engagement
Vision, Speech, and Language

How Amazon Uses Artificial Intelligence
• Fulfilment & Logistics
• Existing Products
• New Products
• Search & Discovery
ML @ AWS: Our mission
Put machine learning in the hands of every developer and data scientist

AWS ML Stack
Application Services
• Vision: Rekognition Image, Rekognition Video
• Speech: Amazon Polly, Transcribe
• Language: Lex, Translate, Comprehend
Platform Services
• Amazon Machine Learning, Mechanical Turk, Spark & EMR, Amazon SageMaker, AWS DeepLens
Frameworks & Infrastructure
• AWS Deep Learning AMI: Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon
• GPU (P3 Instances), CPU, Mobile, IoT (Greengrass)

Vision: Amazon Rekognition
• Object and Scene Detection
• Facial Analysis
• Face Comparison
• Facial Recognition
• Celebrity Recognition
• Image Moderation

New feature: Real-Time Face Search
Real-time face recognition against tens of millions of faces
• <0.5 second response time
• Up to 10M faces
• Enables immediate response

How can we apply these powerful capabilities to video?

Frame-based analysis for videos
• AWS Answers (https://aws.amazon.com/answers/media-entertainment/video-frame-based-analysis/)
• Serverless architecture: AWS Lambda, Amazon DynamoDB, AWS IoT, Amazon SNS, Amazon S3, Amazon SQS, Amazon Rekognition
• ffmpeg
• Using Live Stream?
• Scalability?
• More features?
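
As a rough sketch of the ffmpeg step in the frame-based approach above, the snippet below builds a command line that samples still frames from a stored video at a fixed rate, so that each frame could then be passed to an image analysis API. The helper name `build_frame_extract_cmd`, the file paths, and the sampling rate are illustrative assumptions and are not taken from the AWS Answers solution; only the `-i`, `-vf fps=...`, and `-q:v` options are standard ffmpeg options.

```python
import shlex


def build_frame_extract_cmd(video_path, out_pattern, frames_per_second=1.0):
    """Return an ffmpeg argument list that samples frames at a fixed rate.

    Hypothetical helper for illustration only.
    """
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return [
        "ffmpeg",
        "-i", video_path,                      # input video file
        "-vf", f"fps={frames_per_second:g}",   # keep this many frames per second
        "-q:v", "2",                           # JPEG quality (lower is better)
        out_pattern,                           # numbered output files
    ]


# One frame every two seconds (0.5 frames per second); paths are placeholders.
cmd = build_frame_extract_cmd("input.mp4", "frames/frame_%05d.jpg", 0.5)
print(shlex.join(cmd))
```

To actually extract frames, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed. Sampling fewer frames per second reduces the number of image API calls at the cost of temporal resolution, which is one reason the slide raises scalability as an open question.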

New Service: Amazon Rekognition Video
Video Analysis
• Object and Activity Detection
• Person Tracking
• Face Recognition
• Real-time Live Stream
• Content Moderation
• Celebrity Recognition

Object, Scene and Activity Detection
Blowing a candle, Drinking

Person Tracking

Live Streaming Face Recognition

Activity recognition

One Solution for All
Stored Video
Amazon S3
Media Search Index
Unsafe Video Detection
Investigative Analysis
Video Live Stream
Amazon Kinesis Video Stream
Public Safety Immediate Response
Home Monitoring

Rekognition Media Use Cases
• Acquisition: Preprocessing and Optimization
• Visual Effects and Editing: Application and Filesystem, Texture and Asset Search
• DAM and Archive: Auto-categorization, Metadata Augmentation
• Digital Supply Chain: Tag on Ingest, Live and VOD Feature Extraction, Celebrity Detection
• Publishing: Value Add API-Based Services
• OTT: Filtering and Quality Control
• Playout and Distribution: Filtering and Quality Control
• Analytics: Sentiment Analysis, Other Amazon AI Services (Lex, Polly)

Speech: Amazon Polly
Convert text into life-like speech
• Supports 52 voices across 25 languages
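• Includes Korean (Seoyeon voice)
• Fast response times, suitable for use in real-time systems
• Service endpoint available in the Seoul Region
• Converted audio files can be freely stored, replayed, and distributed
• Generated audio can be used without limit and without a separate contract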

Convert text into life-like speech
• US English Male (Matthew)
• German Female (Vicki)
• Indian English Female (Aditi)
• Japanese Male (Takumi)
• Korean Female (Seoyeon)
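
A minimal sketch of how one of the voices listed above might be selected when calling Polly from Python. The snippet only builds the request arguments; the commented line shows where they could be passed to boto3's `synthesize_speech`, which requires AWS credentials. The `VOICES` table, its keys, and the helper `polly_request` are hypothetical names introduced for illustration.

```python
# Voices named on the slide, keyed by an illustrative language label.
VOICES = {
    "en-US": "Matthew",
    "de-DE": "Vicki",
    "en-IN": "Aditi",
    "ja-JP": "Takumi",
    "ko-KR": "Seoyeon",
}


def polly_request(text, language, output_format="mp3"):
    """Build keyword arguments for a text-to-speech request (hypothetical helper)."""
    if language not in VOICES:
        raise KeyError(f"no voice configured for {language!r}")
    return {
        "Text": text,
        "VoiceId": VOICES[language],
        "OutputFormat": output_format,
    }


params = polly_request("안녕하세요", "ko-KR")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("polly", region_name="ap-northeast-2").synthesize_speech(**params)
print(params)
```

Keeping the voice table separate from the request builder makes it easy to add voices later without touching the calling code.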

Speech marks to synchronize audio and video
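
To illustrate how speech marks could drive audio-video synchronization, the sketch below parses line-delimited JSON speech marks (one object per line, with a `time` offset in milliseconds, a `type`, and a `value`) and turns the word marks into display intervals, for example to highlight each word while it is spoken. The sample data, the total audio length, and the helper names are assumptions made for this example.

```python
import json


def parse_speech_marks(raw):
    """Parse line-delimited JSON speech marks into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_cues(marks, audio_end_ms):
    """Return (start_ms, end_ms, word) tuples; each word lasts until the next one."""
    words = [m for m in marks if m["type"] == "word"]
    cues = []
    for i, mark in enumerate(words):
        # The last word is shown until the end of the audio clip.
        end = words[i + 1]["time"] if i + 1 < len(words) else audio_end_ms
        cues.append((mark["time"], end, mark["value"]))
    return cues


# Hypothetical sample in the line-delimited speech mark format.
SAMPLE = """\
{"time":0,"type":"sentence","start":0,"end":8,"value":"Mary had"}
{"time":6,"type":"word","start":0,"end":4,"value":"Mary"}
{"time":373,"type":"word","start":5,"end":8,"value":"had"}
"""

cues = word_cues(parse_speech_marks(SAMPLE), audio_end_ms=1000)
print(cues)
```

A player could look up the current playback position in these intervals to decide which word, caption, or animation frame to show.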

Customer Case: Amazon Polly Reads You the Chosun Ilbo (조선일보) News
Echo Alexa Skill - Chosun Flash Briefing (조선일보)

Create a beta service using Amazon Polly
Demo link

Architecture using AWS serverless services

Speech: New: Amazon Transcribe
Automatic Speech Recognition
• Time stamps and confidence scores
• Support for both regular and telephony audio
• Punctuation
• S3 integration
• English and Spanish, with more to come
Subtitles for VoD, Broadcast Closed Caption, …
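
As a sketch of the subtitle use case, the snippet below groups word-level items with time stamps into SRT subtitle cues. It assumes items shaped like a transcription result, where each word carries `start_time` and `end_time` strings in seconds and punctuation items carry no times; the sample items, the `max_words` grouping rule, and the helper names are illustrative assumptions rather than a recommended captioning policy.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT time stamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def items_to_srt(items, max_words=7):
    """Group word items into cues of at most max_words words and render SRT."""
    cues, words, start, end = [], [], None, None
    for item in items:
        text = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            if words:
                words[-1] += text  # attach punctuation to the previous word
            continue
        if len(words) >= max_words:
            cues.append((start, end, " ".join(words)))
            words = []
        if not words:
            start = float(item["start_time"])
        end = float(item["end_time"])
        words.append(text)
    if words:
        cues.append((start, end, " ".join(words)))
    blocks = [
        f"{n}\n{srt_timestamp(a)} --> {srt_timestamp(b)}\n{text}"
        for n, (a, b, text) in enumerate(cues, 1)
    ]
    return "\n\n".join(blocks) + "\n"


# Hypothetical sample items.
ITEMS = [
    {"type": "pronunciation", "start_time": "0.0", "end_time": "0.5",
     "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
    {"type": "punctuation", "alternatives": [{"confidence": "0.0", "content": ","}]},
    {"type": "pronunciation", "start_time": "0.6", "end_time": "1.0",
     "alternatives": [{"confidence": "0.98", "content": "world"}]},
    {"type": "punctuation", "alternatives": [{"confidence": "0.0", "content": "."}]},
    {"type": "pronunciation", "start_time": "1.5", "end_time": "2.0",
     "alternatives": [{"confidence": "0.97", "content": "again"}]},
]

print(items_to_srt(ITEMS, max_words=2))
```

A production captioning workflow would usually also break cues at sentence boundaries and long pauses, and might hide words whose confidence score falls below a threshold.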

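Language: Amazon Lex
• Contact centers
• Chatbots, customer service
• Informational / search bots
• Chatbots that respond to customers' everyday requests
• Application bots
• Provide a powerful interface for mobile applications
• Enterprise productivity bots
• Improve the efficiency of enterprise workflows
• IoT bots
• Add conversational capabilities to devices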

Language: New: Amazon Translate (Preview)
Real-time translation service
• Real-time translation
• Powered by deep learning
• 12 language pairs (more to come)
• Language detection

Language: New: Amazon Comprehend
Natural Language Processing
Sentiment, Entities, Languages, Key phrases, Topic modeling
Powered by Deep Learning

Customer Case: Media & Entertainment
Opportunities
• Petabytes of images
• 100+ years of content
• How can we enrich our metadata in AWS?
• How can we unleash the value of content we already own once in AWS?

Challenges
• Niche Image Categories
• Low & Ultra High Resolutions
• Artifacts & Noise
• Black and White Footage
• Historical Context
• High Accuracy Required

Digital Transformation
AWS Migration
• Storage / Archive
• Editing & Publishing
• Video Streaming
• Web Apps

Object & Scene Detection: Amazon Rekognition
Identify objects, scenes & concepts, and provide confidence scores
Example labels: Shoe, Ramp, Person
Example labels: Sky, Person, Eagle, Desert, Mountain
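
Because each label comes with a confidence score, a consumer of the results typically keeps only the labels above some threshold before writing them to a metadata index. The sketch below shows that filtering step on a response-shaped dictionary with a `Labels` list of `Name` and `Confidence` entries; the confidence values and the threshold are made up for illustration, and `labels_above` is a hypothetical helper.

```python
def labels_above(response, min_confidence=80.0):
    """Return (name, confidence) pairs at or above the threshold, best first."""
    kept = [
        (label["Name"], label["Confidence"])
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Response-shaped sample using label names from the slide; scores are invented.
SAMPLE_RESPONSE = {
    "Labels": [
        {"Name": "Sky", "Confidence": 98.2},
        {"Name": "Person", "Confidence": 91.5},
        {"Name": "Eagle", "Confidence": 62.0},
        {"Name": "Desert", "Confidence": 88.7},
        {"Name": "Mountain", "Confidence": 79.9},
    ]
}

print(labels_above(SAMPLE_RESPONSE))
```

With boto3 and AWS credentials, a dictionary of this shape could be obtained from `boto3.client("rekognition").detect_labels(...)`, which also accepts a `MinConfidence` parameter so that filtering can happen on the service side.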

Archive, DAM/MAM, Searching metadata, AI processing on AWS
[Architecture diagram. Stages: Ingest, Processing, Delivery. Labels: Browser / API Client, CloudFront, Service Frontend (API Gateway), Metadata Service (API Gateway), Lambda(s), Step Functions, Image Processing, UUID Generator, UUID, Rekognition, Label Detection, Realtime Search, ElasticSearch, Client Lookup, Asset Metadata, DynamoDB, Content Archive, S3 Image Storage]
Response fragment shown on the slide:
{
"FaceMatches": [
{"Face": {"BoundingB
"Height": 0.2683333456516266,
"Left": 0.5099999904632568,
"Top": 0.1783333271741867,
"Width": 0.17888888716697693},
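
The `Height`, `Left`, `Top`, and `Width` values in the response fragment on this slide are ratios of the overall image dimensions, so a front end that draws the box has to scale them by the image size. The sketch below does that conversion with the slide's values; the 900 x 600 pixel image size and the helper name `box_to_pixels` are assumptions for the example.

```python
def box_to_pixels(box, image_width, image_height):
    """Convert a ratio-based bounding box to (left, top, width, height) in pixels."""
    left = round(box["Left"] * image_width)
    top = round(box["Top"] * image_height)
    width = round(box["Width"] * image_width)
    height = round(box["Height"] * image_height)
    return left, top, width, height


# Ratios copied from the slide's response fragment.
BOX = {
    "Height": 0.2683333456516266,
    "Left": 0.5099999904632568,
    "Top": 0.1783333271741867,
    "Width": 0.17888888716697693,
}

# The image size below is an assumption made for this example.
print(box_to_pixels(BOX, image_width=900, image_height=600))
```

Storing the ratios rather than pixel values in the metadata store keeps the record valid for every rendition of the same image, since thumbnails and full-resolution copies can each apply their own dimensions.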

Back to the Challenges – Deep learning required
• Custom Concepts
NLP – Rekognition + spaCy, Others
• Specialized Categories
Transfer Learning w/ Finetuning
• Black & White Footage
Deep Learning-based Colorization
• Low Resolutions
Convolutional Neural Net Image Scaling
• Niche & Historical Context
Crowd working & OCR
Real-Time User-Guided Image Colorization with Learned Deep Priors
https://richzhang.github.io/ideepcolor

Deep Learning in the AWS Cloud

AWS ML Stack - revisited
Application Services
• Vision: Rekognition Image, Rekognition Video
• Speech: Amazon Polly, Transcribe
• Language: Lex, Translate, Comprehend
Platform Services
• Amazon Machine Learning, Mechanical Turk, Spark & EMR, Amazon SageMaker, AWS DeepLens
Frameworks & Infrastructure
• AWS Deep Learning AMI: Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon
• GPU (P3 Instances), CPU, Mobile, IoT (Greengrass)

Checkpoints
• AWS ML (Machine Learning) Stack
• AWS ML Applications: Vision, Speech, Language
• AWS Media Capabilities – 8 key media workloads
• Metadata enrichment using AWS ML applications / platform services
• Continuous update / refinement is important

After this session…
• Amazon AI Home Page: https://aws.amazon.com/blogs/ai/
• Amazon Rekognition Home Page: https://aws.amazon.com/rekognition
• Amazon Polly Home Page: https://aws.amazon.com/polly/
