2. Agenda
• The need for AI technology in the media and entertainment industry
• Introduction to AWS AI services
• AWS ML Stack
• Vision, Speech, Language
• Deep Learning frameworks / Amazon SageMaker
• Customer case: Chosun Ilbo
• AI use cases in media
3. The Need for AI: Media & Entertainment
There are 3,700,000,000 Internet users in 2017*
1,200,000,000 photos will be taken in 2017 (9% YoY growth)*
50% of 2016 Internet traffic was video, and will likely be 70% by 2021**
Multi-petabyte asset storage with > 1 PB MoM growth is commonplace on AWS
Sources: * InfoTrends Worldwide, **StreamingMedia.com
4. AI Use Cases in the Media Industry
• Richer Metadata
• Automated metadata generation for B2B and B2C
• Use of IMDb and active use of cast information
• Large-scale batch processing of digital archives
• Enhanced Experiences
• Intelligent content filtering
• Using suitable content to improve the user experience
• Security and Analytics
• Automated filtering of UGC (User Generated Content)
• Workflow improvement
• Audience engagement
Vision, Speech, and Language
10. New feature: Real-Time Face Search
Real-time face recognition against tens of millions of faces
• <0.5 second response time
• Up to 10M faces
• Enable immediate response
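A minimal sketch of how a face search against an existing collection might look from Python. It assumes a Rekognition client created with boto3 (`boto3.client("rekognition")`) and a face collection that has already been populated; the function name `search_face`, the collection name, and the default thresholds are illustrative and not from the talk. The client is passed in so the function can be exercised without AWS credentials.

```python
from typing import Any, Dict, List


def search_face(client: Any, collection_id: str, image_bytes: bytes,
                threshold: float = 90.0, max_faces: int = 5) -> List[Dict[str, Any]]:
    """Search a face collection for the largest face found in image_bytes.

    `client` is expected to behave like boto3.client("rekognition").
    Returns simplified matches, best similarity first.
    """
    response = client.search_faces_by_image(
        CollectionId=collection_id,          # hypothetical collection name
        Image={"Bytes": image_bytes},
        FaceMatchThreshold=threshold,        # assumed default, tune per use case
        MaxFaces=max_faces,
    )
    matches = [
        {
            "face_id": m["Face"]["FaceId"],
            "external_id": m["Face"].get("ExternalImageId"),
            "similarity": m["Similarity"],
        }
        for m in response.get("FaceMatches", [])
    ]
    return sorted(matches, key=lambda m: m["similarity"], reverse=True)
```

A call could look like `search_face(boto3.client("rekognition"), "cast-members", open("frame.jpg", "rb").read())`, where `cast-members` is a made-up collection name.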
18. New Service: Amazon Rekognition Video
One Solution for All
Stored Video (Amazon S3)
• Media Search Index
• Unsafe Video Detection
• Investigative Analysis
Video Live Stream (Amazon Kinesis Video Streams)
• Public Safety Immediate Response
• Home Monitoring
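For the stored-video path, a media search index could be fed by an asynchronous label detection job. The sketch below assumes a boto3 Rekognition client and a video already uploaded to S3; the function name `detect_video_labels` and the polling loop are illustrative choices. In practice a job-completion notification through the API's `NotificationChannel` parameter would probably replace polling.

```python
import time
from typing import Any, Dict, Iterator


def detect_video_labels(client: Any, bucket: str, key: str,
                        min_confidence: float = 80.0,
                        poll_seconds: float = 5.0) -> Iterator[Dict[str, Any]]:
    """Yield one dict per detected label: timestamp_ms, name, confidence.

    `client` is expected to behave like boto3.client("rekognition").
    """
    job = client.start_label_detection(
        Video={"S3Object": {"Bucket": bucket, "Name": key}},
        MinConfidence=min_confidence,
    )
    job_id = job["JobId"]

    # Poll until the asynchronous job leaves the IN_PROGRESS state.
    while True:
        page = client.get_label_detection(JobId=job_id, SortBy="TIMESTAMP")
        if page["JobStatus"] != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    if page["JobStatus"] != "SUCCEEDED":
        raise RuntimeError(f"label detection ended with status {page['JobStatus']}")

    # Walk every result page, following NextToken while one is returned.
    while True:
        for item in page.get("Labels", []):
            yield {
                "timestamp_ms": item["Timestamp"],
                "name": item["Label"]["Name"],
                "confidence": item["Label"]["Confidence"],
            }
        token = page.get("NextToken")
        if not token:
            break
        page = client.get_label_detection(
            JobId=job_id, SortBy="TIMESTAMP", NextToken=token)
```

Each yielded record pairs a label with its position in the video, which is the shape a search index needs to answer "where does X appear".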
19. Rekognition Media Use Cases
Playout and Distribution
Filtering and Quality Control
Visual Effects and Editing
Application and Filesystem
Texture and Asset Search
Analytics
Sentiment Analysis
Other Amazon AI Services (Lex, Polly)
DAM and Archive
Auto-categorization
Metadata Augmentation
Digital Supply Chain
Tag on Ingest
Live and VOD Feature Extraction
Celebrity Detection
Publishing
Value Add
API-Based Services
OTT
Filtering and Quality Control
Acquisition
Preprocessing and Optimization
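21. Speech: Amazon Polly
Convert text into life-like speech
• 52 voices across 25 languages
• Includes Korean (Seoyeon)
• Fast response times, suitable for use in real-time systems
• Service endpoint available in the Seoul Region
• Converted audio files can be freely stored, played back, and distributed
• Generated audio can be used without limit, with no separate contract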
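As a rough illustration of the points above, the following sketch converts Korean text to an MP3 file with the Seoyeon voice. It assumes a boto3 Polly client, for example `boto3.client("polly", region_name="ap-northeast-2")` for the Seoul Region; the function name `synthesize_korean` and the output path are hypothetical.

```python
from typing import Any


def synthesize_korean(client: Any, text: str, out_path: str,
                      voice_id: str = "Seoyeon") -> int:
    """Synthesize `text` to an MP3 file and return the number of bytes written.

    `client` is expected to behave like boto3.client("polly").
    """
    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # AudioStream is a file-like object; read it fully and store it locally.
    audio = response["AudioStream"].read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return len(audio)
```

The saved file is an ordinary MP3, so it can be stored, played back, or distributed like any other audio asset.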
22. Speech: Amazon Polly
Convert text into life-like speech
• US English Male (Matthew)
• German Female (Vicki)
• Indian English Female (Aditi)
• Japanese Male (Takumi)
• Korean Female (Seoyeon)
24. Customer Case: Amazon Polly Reads You Chosun Ilbo News
Echo Alexa Skill - Chosun Flash Briefing (Chosun Ilbo)
25. Customer Case: Amazon Polly Reads You Chosun Ilbo News
Create a beta service using Amazon Polly
Demo link
26. Customer Case: Amazon Polly Reads You Chosun Ilbo News
Architecture using AWS serverless services
27. Speech: New: Amazon Transcribe
Automatic Speech Recognition
• Time stamps and confidence scores
• Support for both regular and telephony audio
• Punctuation
• S3 integration
• English and Spanish, with more to come
Subtitles for VoD, Broadcast Closed Caption, …
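28. Language: Amazon Lex
• Contact centers
• Chatbots, customer service
• Informational / search bots
• Chatbots that respond to customers' everyday requests
• Application bots
• Provide a powerful interface for mobile applications
• Enterprise productivity bots
• Improve the efficiency of enterprise workflows
• IoT bots
• Add conversational capabilities to devices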
32. Customer Case: Media & Entertainment
Opportunities
• Petabytes of images
• 100+ years of content
• How can we enrich our metadata in AWS?
• How can we unleash the value of content we already own once in AWS?
33. Customer Case: Media & Entertainment
Challenges
• Niche Image Categories
• Low & Ultra High Resolutions
• Artifacts & Noise
• Black and White Footage
• Historical Context
• High Accuracy Required
34. Customer Case: Media & Entertainment
Digital Transformation
AWS Migration
• Storage / Archive
• Editing & Publishing
• Video Streaming
• Web Apps
35. Customer Case: Media & Entertainment
Object & Scene Detection: Amazon Rekognition
Identify objects, scenes & concepts, and provide confidence scores
Example labels: Shoe, Ramp, Person
Example labels: Sky, Person, Eagle, Desert, Mountain
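Label detection of this kind could be called from Python roughly as follows. The sketch assumes a boto3 Rekognition client and an image stored in S3; the function name `label_image` and the default limits are illustrative.

```python
from typing import Any, List, Tuple


def label_image(client: Any, bucket: str, key: str,
                max_labels: int = 10,
                min_confidence: float = 75.0) -> List[Tuple[str, float]]:
    """Return (label name, confidence) pairs for an image stored in S3.

    `client` is expected to behave like boto3.client("rekognition").
    """
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,   # assumed cut-off, tune per archive
    )
    return [(label["Name"], round(label["Confidence"], 1))
            for label in response["Labels"]]
```

The confidence score returned with each label is what lets a pipeline decide whether to accept a tag automatically or route the asset to human review.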
36. Customer Case: Media & Entertainment
Archive, DAM/MAM, Searching metadata, AI processing on AWS
[Architecture diagram]
Stages: Ingest, Processing, Delivery
Client: Browser / API Client, CloudFront, API Gateway
Services: Frontend Service, Metadata Service, Client Lookup, Realtime Search
Processing: Lambda(s), Step Functions, UUID Generator, Image Processing, Label Detection (Rekognition)
Storage: Content Archive (S3 Image Storage), Asset Metadata (DynamoDB), Elasticsearch
Sample response fragment shown on the slide:
{
"FaceMatches": [
{"Face": {"BoundingB
"Height": 0.2683333456516266,
"Left": 0.5099999904632568,
"Top": 0.1783333271741867,
"Width": 0.17888888716697693},
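One way the ingest path of such an architecture could be wired is sketched below: an S3 upload event triggers a function that assigns a UUID, asks Rekognition for labels, and writes the result to DynamoDB. The slide does not spell out the exact data flow, so this is an assumption; the table name, attribute names, and function names are hypothetical. The clients are expected to behave like `boto3.client("rekognition")` and `boto3.client("dynamodb")`.

```python
import uuid
from typing import Any, Dict, List
from urllib.parse import unquote_plus


def build_item(asset_id: str, bucket: str, key: str,
               labels: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a DynamoDB item in low-level attribute-value format."""
    return {
        "asset_id": {"S": asset_id},                 # hypothetical key name
        "s3_uri": {"S": f"s3://{bucket}/{key}"},
        "labels": {"L": [
            {"M": {"name": {"S": label["Name"]},
                   "confidence": {"N": str(label["Confidence"])}}}
            for label in labels
        ]},
    }


def handle_ingest(event: Dict[str, Any], rekognition: Any, dynamodb: Any,
                  table_name: str) -> List[str]:
    """Process an S3 put event: label each new image and store its metadata."""
    asset_ids = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        asset_id = str(uuid.uuid4())
        labels = rekognition.detect_labels(
            Image={"S3Object": {"Bucket": bucket, "Name": key}},
            MinConfidence=80.0,                      # assumed threshold
        )["Labels"]
        dynamodb.put_item(TableName=table_name,
                          Item=build_item(asset_id, bucket, key, labels))
        asset_ids.append(asset_id)
    return asset_ids
```

In a Lambda deployment, a thin `lambda_handler(event, context)` wrapper would create the two clients once at module load and pass them in; indexing the same item into a search service would be a further step not shown here.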
37. Customer Case: Media & Entertainment
Back to the Challenges: Deep learning required
• Custom Concepts
NLP: Rekognition + spaCy, others
• Specialized Categories
Transfer Learning w/ Finetuning
• Black & White Footage
Deep Learning-based Colorization
• Low Resolutions
Convolutional Neural Net Image Scaling
• Niche & Historical Context
Crowd working & OCR
Real-Time User-Guided Image Colorization with Learned Deep Priors
https://richzhang.github.io/ideepcolor
39. AWS ML Stack - revisited
Application Services
• Vision: Rekognition Image, Rekognition Video
• Speech: Amazon Polly, Transcribe
• Language: Lex, Translate, Comprehend
Platform Services
• Amazon SageMaker
• AWS DeepLens
• Amazon Machine Learning
• Mechanical Turk
• Spark & EMR
Frameworks & Infrastructure
• AWS Deep Learning AMI
• Frameworks: Apache MXNet, PyTorch, Cognitive Toolkit, Keras, Caffe2 & Caffe, TensorFlow, Gluon
• Infrastructure: GPU (P3 Instances), Mobile, CPU, IoT (Greengrass)
40. Checkpoints
• AWS ML (Machine Learning) Stack
• AWS ML Applications: Vision, Speech, Language
• AWS Media Capabilities: 8 key media workloads
• Metadata Enrichment using AWS ML applications / platform services
• Continuous Update / Refinement is important
41. After This Session…
• Amazon AI Home Page:
https://aws.amazon.com/blogs/ai/
• Amazon Rekognition Home Page:
https://aws.amazon.com/rekognition
• Amazon Polly Home Page:
https://aws.amazon.com/polly/