© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Tim Sullivan and Ari Bixhorn, Panopto
December 2, 2016
Searching Inside Video at Petabyte-Scale using Spot
What to Expect from the Session
Primer on inside-video search
Dive into how we use Spot to search video at scale
Overview of our cross-platform architecture
Best practices for scaling Spot Instances elastically
Searching Inside Videos
Video: A Last-mile Problem for Search
30 trillion web pages
Email and documents
File system contents
Video?
3 minutes, 53 seconds
15 - 90 minutes
Title: An Introduction to Network Security
Description: A broad overview of network security as defined by today’s hybrid corporate WANs.
Tags: Network security, intrusion detection, corporate WAN, firewall, authentication
125 words per minute
5,625 words spoken
The network is the entry point to your application. It provides the first gatekeepers that
control access to the various servers in your environment. Servers are protected with
their own operating system gatekeepers, but it is important not to allow them to be
deluged with attacks from the network layer. It is equally important to ensure that network
gatekeepers cannot be replaced or reconfigured by imposters. In a nutshell, network
security involves protecting network devices and the data that they forward.
The basic components of a network, which act as the front-line gatekeepers, are the
router, the firewall, and the switch. An attacker looks for poorly configured network
gatekeepers to exploit. Common vulnerabilities include weak default installation settings,
wide-open access controls, and unpatched devices.
5,625 words spoken
50% have no search value
2,813 words with search value
With 10 tags, you’ve only covered 0.3% of valuable content
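The arithmetic behind these slides can be checked with a short sketch. The 45-minute video length is not stated on the slide; it is derived here from 5,625 words at 125 words per minute. The exact ratio works out to roughly 0.36%, which the slide gives as 0.3%.

```python
import math

WORDS_PER_MINUTE = 125   # speaking rate from the slide
WORDS_SPOKEN = 5_625     # total words from the slide
SEARCHABLE_SHARE = 0.5   # slide: 50% of words have no search value
TAG_COUNT = 10           # manually entered tags


def tag_coverage(words_spoken, searchable_share, tag_count):
    """Return (searchable word count, fraction of those words covered by tags)."""
    # The slide rounds 2,812.5 up to 2,813, so round up here as well.
    searchable_words = math.ceil(words_spoken * searchable_share)
    return searchable_words, tag_count / searchable_words


video_minutes = WORDS_SPOKEN / WORDS_PER_MINUTE
searchable_words, coverage = tag_coverage(WORDS_SPOKEN, SEARCHABLE_SHARE, TAG_COUNT)
print(f"{video_minutes:.0f} min video, {searchable_words} searchable words, "
      f"{coverage:.2%} covered by {TAG_COUNT} tags")
```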
Six Types of Video Content Indexing
1. Manually entered metadata
2. Transcription
3. Automatic Speech Recognition (ASR)
4. Optical Character Recognition (OCR)
5. Slide extraction
6. Viewer notes
Demo – Video Search
What Led Us to Spot?
Our Challenge
[Chart: growth over time, 2013-01 through 2016-01]
Running on AWS since 2009
Growing exponentially
Need to index every video – quickly & cost-efficiently
15 years of video (400TB) content uploaded monthly
Need to extract metadata out of 4PB of video
122M unique images have been indexed for OCR
>3TB SOLR index
* Numbers are inclusive of both enterprise and education accounts; numbers do not include on-premises customers
Option 1: On-Demand Amazon EC2 Instances
Cost-prohibitive to offer to all customers
[Chart: cost ($) vs. hours of content, with markers for Budget, Today, and Enable ASR/OCR]
Option 2: Make Search an Upsell Capability
[Diagram: Panopto platform stages: Content Ingestion, Content Management, Content Discovery, Content Delivery, Content Consumption]
Components shown: Windows and Mac Clients; Mobile Apps; Video Capture Appliance; Remote Capture Client; Other Ingestion; Transcoding; Editing; Search Indexing; Governance; Analytics; Access Control; Video CMS; Public Hosting; SmartSearch™; Email and Social Integrations; Search Federation; Panopto Streaming; CDN Integration; P2P Streaming; Panopto ECDN; WAN Op Solutions; Interactive Player; Panopto Mobile; Audio Podcast; Embedded Player; Quizzing and Polls
Option 3: Use Reserved Instances (RIs)
Theoretically would save costs
RIs work best for predictable workloads
A 30-second SLA to begin indexing results in a spiky demand curve rather than a flat line
c3.2xlarge (On-Demand hourly: $0.42)
Upfront   Monthly   Effective Hourly   Savings over On-Demand
$0        $213.16   $0.292             30%
$1304     $75.92    $0.253             40%
$2170     $0.00     $0.248             41%
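The effective hourly figures in the table can be reproduced with a small sketch. It assumes a 1-year term (8,760 hours), which the slide does not state but which matches the published numbers; the function names are illustrative only.

```python
HOURS_PER_YEAR = 365 * 24   # 8,760; assumption: 1-year RI term
ON_DEMAND_HOURLY = 0.42     # c3.2xlarge On-Demand price from the slide


def effective_hourly(upfront, monthly, term_hours=HOURS_PER_YEAR):
    """Spread the upfront fee and twelve monthly payments over the term."""
    return (upfront + 12 * monthly) / term_hours


def savings_vs_on_demand(effective, on_demand=ON_DEMAND_HOURLY):
    """Fractional savings of an effective hourly rate against On-Demand."""
    return 1 - effective / on_demand


# (upfront, monthly) pairs taken from the slide's table
RI_OPTIONS = [(0.0, 213.16), (1304.0, 75.92), (2170.0, 0.0)]

for upfront, monthly in RI_OPTIONS:
    rate = effective_hourly(upfront, monthly)
    print(f"upfront ${upfront:,.0f}, monthly ${monthly:.2f}: "
          f"${rate:.3f}/hour, {savings_vs_on_demand(rate):.0%} savings")
```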
[Chart: # instances over time (t) relative to the RI level, with regions labeled Delayed Start and Waste]
[Chart: # instances over time (t) relative to the RI level, with regions labeled Overspend and Waste]
Option 4: Buy Our Own Hardware
Option 5: Spot Instances
Excess EC2 capacity auctioned at steeply discounted prices
Spot Instances can be accessed on demand to meet our variable needs
On-Demand Instances
Spot Instances added when bid ≥ market
On-Demand vs. Spot Instances
On-Demand:
Pre-configured or custom machine images
Configure security and network access
Choose from instance types and locations
Use static IP endpoints
Attach persistent block storage to instances
Pay fixed price by the hour
Spot:
Pre-configured or custom machine images
Configure security and network access
Choose from instance types and locations
Use static IP endpoints
Attach persistent block storage to instances
Pay variable price by the hour
[Chart: cost ($) vs. hours of content for On-Demand and Spot, with Budget and Today marked]
The Spot Auction
Set a bid price (for example, $0.27)
Instances run while bid ≥ market price
Instances terminate when bid < market price
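The bid rule on this slide can be expressed as a one-line comparison. The sketch below is illustrative only: the market price series is made up, and the function name is hypothetical.

```python
def instance_states(bid, market_prices):
    """Apply the slide's rule to each observed market price."""
    return ["run" if bid >= price else "terminate" for price in market_prices]


# Hypothetical market prices; the $0.27 bid is the slide's example value.
EXAMPLE_BID = 0.27
example_prices = [0.14, 0.24, 0.27, 0.34]

states = instance_states(EXAMPLE_BID, example_prices)
for price, state in zip(example_prices, states):
    print(f"market ${price:.2f} vs bid ${EXAMPLE_BID:.2f}: {state}")
```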
Spot Considerations
Is your workload appropriate for potential volatility?
How to deal with a lack of capacity?
Can you run on a wide range of instance types (via Spot Fleet)?
Look at historical bid prices for your instance types and regions to estimate your savings.
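One way to turn a price history into a savings estimate is sketched below. The price samples are invented for illustration, and the function is hypothetical; in practice the history could come from the EC2 DescribeSpotPriceHistory API.

```python
from statistics import mean


def estimate_savings(price_history, bid, on_demand_price):
    """Return (availability, savings) for a bid against historical Spot prices.

    availability: share of samples where the bid would have kept instances running.
    savings: fractional discount of the average paid price versus On-Demand.
    """
    usable = [price for price in price_history if price <= bid]
    if not usable:
        return 0.0, 0.0
    availability = len(usable) / len(price_history)
    savings = 1 - mean(usable) / on_demand_price
    return availability, savings


# Hypothetical hourly samples for one instance type in one region.
sample_history = [0.10, 0.12, 0.30, 0.14]
availability, savings = estimate_savings(sample_history, bid=0.27, on_demand_price=0.42)
print(f"available {availability:.0%} of the time, about {savings:.0%} cheaper than On-Demand")
```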
Our Implementation
The Importance of Windows to our Architecture
Single codebase for cloud and on-premises
For on-prem customers, Windows is often a requirement
Windows is therefore critical to our cloud architecture as well
On-Prem Cloud
Panopto Cloud on AWS
Distributed across Availability Zones
Cross-Platform Implementation
Web Servers
App Servers
Database
Speech Recognition
Apache SOLR
Using Auto Scaling Groups
[Chart: Demand vs. Running Instances]
Using AWS CloudFormation
Define ASGs and auto-scale rules
From On-Demand to Spot
"OnDemandLaunchConfig": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Properties": {
    "SecurityGroups": { "Ref": "backendSecurityGrpIds" },
    "IamInstanceProfile": { "Ref": "BackendEncoders..." },
    "ImageId": { "Ref": "ami" },
    "InstanceType": { "Ref": "instanceType" },
    "InstanceMonitoring": false,
    "AssociatePublicIpAddress": true,
    "EbsOptimized": { "Ref": "ebsOptimized" },
    "BlockDeviceMappings": [
      {
        "DeviceName": "xvdca"
      }
    ]
  }
},
"SpotLaunchConfig": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Condition": "CreateSpotGroup",
  "Properties": {
    "SecurityGroups": { "Ref": "backendSecurityGrpIds" },
    "IamInstanceProfile": { "Ref": "BackendEncoders..." },
    "ImageId": { "Ref": "ami" },
    "InstanceType": { "Ref": "instanceType" },
    "SpotPrice": { "Ref": "spotPrice" },
    "InstanceMonitoring": false,
    "AssociatePublicIpAddress": true,
    "EbsOptimized": { "Ref": "ebsOptimized" },
    "BlockDeviceMappings": [
      {
        "DeviceName": "xvdca"
      }
    ]
  }
}
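The two launch configurations above differ only in the Spot resource's Condition and SpotPrice entries. A minimal sketch of that difference follows, using a trimmed-down dictionary rather than the full template; the helper name to_spot is hypothetical.

```python
import copy
import json

# Trimmed-down On-Demand launch configuration (a subset of the template's properties).
on_demand_launch_config = {
    "Type": "AWS::AutoScaling::LaunchConfiguration",
    "Properties": {
        "ImageId": {"Ref": "ami"},
        "InstanceType": {"Ref": "instanceType"},
        "InstanceMonitoring": False,
    },
}


def to_spot(launch_config, condition="CreateSpotGroup", price_parameter="spotPrice"):
    """Return a copy of the launch configuration with the two Spot-specific additions."""
    spot = copy.deepcopy(launch_config)
    spot["Condition"] = condition                                # only created when Spot is enabled
    spot["Properties"]["SpotPrice"] = {"Ref": price_parameter}   # the bid price parameter
    return spot


spot_launch_config = to_spot(on_demand_launch_config)
print(json.dumps(spot_launch_config, indent=2))
```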
Bidding Strategy: Start Simple
Sealed-bid, second-price auction
Set your bid to the market price of an On-Demand Instance
[Chart: price levels $0.14, $0.24, $0.34; On-Demand Instance Price: $0.84]
The Challenge of Long-Running Jobs
The longer the job, the greater the chance of instance revocation
Short window to determine how best to fail over (2 minutes)
[Chart: Chance of Instance Revocation vs. Job Length]
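The relationship between job length and revocation risk can be illustrated with a simple model. The sketch assumes a constant, independent chance of revocation in each hour, which is a simplification not taken from the slides; the 5% hourly rate is made up.

```python
def revocation_chance(job_hours, hourly_revocation_rate):
    """Chance that a job is interrupted at least once.

    Assumes each hour carries the same independent revocation probability.
    """
    return 1 - (1 - hourly_revocation_rate) ** job_hours


HYPOTHETICAL_HOURLY_RATE = 0.05  # illustrative value only

for hours in (0.5, 1, 4, 12):
    chance = revocation_chance(hours, HYPOTHETICAL_HOURLY_RATE)
    print(f"{hours:>4} hour job: {chance:.1%} chance of revocation")
```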
Managing Jobs in the Face of Instance Revocation
[Flow: Market price increase ($) -> Spot -> “Spotter” service]
Wait until T-30s
Is job done?
Yes: No action
No:
1. Save State
2. Kill Job
3. Reallocate
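The decision flow above can be sketched as follows. This is not Panopto's Spotter service; the class, function names, and job ID are hypothetical, and the sleep function is passed in so the logic can be exercised without actually waiting.

```python
from dataclasses import dataclass, field
from typing import Callable, List

FAILOVER_MARGIN_SECONDS = 30  # "T-30s" in the flow


@dataclass
class Job:
    job_id: str
    done: bool = False
    actions: List[str] = field(default_factory=list)


def seconds_until_check(now: float, termination_time: float) -> float:
    """How long to wait before checking the job, leaving a 30-second margin."""
    return max(0.0, termination_time - FAILOVER_MARGIN_SECONDS - now)


def handle_revocation(job: Job, now: float, termination_time: float,
                      sleep: Callable[[float], None]) -> str:
    """Wait until T-30s, then either do nothing or save, kill, and reallocate."""
    sleep(seconds_until_check(now, termination_time))
    if job.done:
        return "no action"
    job.actions.extend(["save state", "kill job", "reallocate"])
    return "reallocated"


# Simulated 2-minute warning; waits are recorded instead of slept.
waits: List[float] = []
unfinished = Job("ocr-1")
outcome = handle_revocation(unfinished, now=0.0, termination_time=120.0, sleep=waits.append)
print(outcome, waits, unfinished.actions)
```

Giving the job the first 90 seconds of the 2-minute window lets short jobs finish on their own before any state is saved.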
Scaling Up with Predictive Job Modeling
Inputs
1. Number of waiting jobs
2. Number of jobs currently processing
3. When current jobs expected to finish
4. Incoming jobs in the last <interval>
5. Number of jobs expected to arrive
6. Time to spin up new machine
7. SLA by job
More processing capacity required?
Data scientists?
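A heavily simplified sketch of how a few of these inputs might combine into a scale-up decision is shown below. The formula, parameter names, and numbers are all assumptions for illustration; the slides do not describe the actual model, and SLA and spin-up time are folded into a single planning window here.

```python
import math


def extra_instances_needed(waiting_jobs, processing_jobs, finishing_soon,
                           expected_arrivals, running_instances, workers_per_instance):
    """Estimate how many instances to add for the next planning window.

    finishing_soon and expected_arrivals both refer to the same window,
    for example the time needed to spin up a new machine.
    """
    capacity = running_instances * workers_per_instance
    free_slots = capacity - processing_jobs + finishing_soon
    demand = waiting_jobs + expected_arrivals
    deficit = demand - free_slots
    return max(0, math.ceil(deficit / workers_per_instance))


# Hypothetical snapshot of the job queue.
needed = extra_instances_needed(
    waiting_jobs=10, processing_jobs=8, finishing_soon=2,
    expected_arrivals=6, running_instances=2, workers_per_instance=4,
)
print(f"launch {needed} more instances")
```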
Amazon CloudWatch Dashboards
Scaling Down
[Diagram: instances labeled Active or Hold]
If the rate of incoming and in-process jobs is less than current processing capacity, then we’re in a scale-down state.
Identify instances that are not processing jobs. Then identify those within 15 minutes of a billing hour.
[Diagram: instances labeled Active, Hold, or Scale Down]
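The scale-down rule can be sketched as a small classifier. The mapping of states is an interpretation, not taken from the slides: Active means processing a job, Hold means idle but not yet near a billing boundary, and Scale Down means idle and within the last 15 minutes of an hour measured from launch.

```python
from dataclasses import dataclass

BILLING_HOUR_MINUTES = 60
SCALE_DOWN_WINDOW_MINUTES = 15


@dataclass
class Instance:
    instance_id: str
    minutes_since_launch: int
    busy: bool


def classify(instance: Instance) -> str:
    """Label an instance as Active, Hold, or Scale Down."""
    if instance.busy:
        return "Active"
    minutes_into_hour = instance.minutes_since_launch % BILLING_HOUR_MINUTES
    minutes_left = BILLING_HOUR_MINUTES - minutes_into_hour
    return "Scale Down" if minutes_left <= SCALE_DOWN_WINDOW_MINUTES else "Hold"


# Hypothetical fleet snapshot.
fleet = [
    Instance("i-a", minutes_since_launch=50, busy=False),
    Instance("i-b", minutes_since_launch=20, busy=False),
    Instance("i-c", minutes_since_launch=55, busy=True),
]
labels = {instance.instance_id: classify(instance) for instance in fleet}
print(labels)
```

Holding idle instances until late in the hour avoids discarding capacity that has already been paid for.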
But what if there’s a deficit of Spot capacity?
Operate two Auto Scaling groups for each backend worker pool
One Spot ASG and one On-Demand ASG
When actual Spot capacity < desired capacity, offload to On-Demand
[Diagram: Automatic Speech Recognition worker pool backed by Spot and On-Demand groups]
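The fallback rule amounts to covering the Spot shortfall with On-Demand capacity. A minimal sketch, with a hypothetical function name and example numbers:

```python
def on_demand_desired(desired_capacity, actual_spot_capacity):
    """On-Demand instances needed to cover whatever Spot could not supply."""
    return max(0, desired_capacity - actual_spot_capacity)


# Hypothetical worker pool that wants 10 instances but only received 6 from Spot.
shortfall = on_demand_desired(desired_capacity=10, actual_spot_capacity=6)
print(f"set the On-Demand group's desired capacity to {shortfall}")
```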
Spot Futures at Panopto
Move to Spot Fleet
Ability to launch the most cost-efficient instance type for any job
Lower prices with diversified resources
Ability to apply custom weighting (create capacity units based on our app needs)
Challenge: no accounting for the cost of EBS
Challenge: lacking ASG’s health checks
Challenge: lacking ASG’s tag propagation
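The idea of weighting instance types by capacity units can be illustrated by comparing price per unit. The prices and unit weights below are invented for the example; only the approach of normalizing price by capacity is drawn from the slide.

```python
def price_per_unit(spot_price, capacity_units):
    """Normalize a Spot price by the capacity units the instance provides."""
    return spot_price / capacity_units


def cheapest_per_unit(offers):
    """Pick the instance type with the lowest price per capacity unit.

    offers maps instance type -> (spot price per hour, capacity units).
    """
    return min(offers, key=lambda name: price_per_unit(*offers[name]))


# Hypothetical prices and weights (1 unit = one parallel worker, for example).
example_offers = {
    "c3.xlarge": (0.06, 1),
    "c3.2xlarge": (0.10, 2),
    "c3.4xlarge": (0.26, 4),
}
best = cheapest_per_unit(example_offers)
print(f"most cost-efficient type: {best}")
```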
From Immutable to Dynamic Instance Configuration
Need to account for different processing capacity of different instance types
Will need to optimize number of workers being run in parallel on each VM
Substantial cost savings potential
Today: Immutable
Pro: Spin up instances quickly
Con: Could be more cost-efficient
Future: Dynamic
Choose the best Availability Zone and instance type based on market price
Subdividing Jobs
Today: Painful to cancel a 90% complete, 30-minute OCR indexing job
Future: Subdivide job for grid processing
Grid processing minimizes impact of Spot Instance loss
Also allows greater parallelization for faster user-visible time to task completion
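The benefit of subdividing can be shown with the slide's own example of a 30-minute job revoked at 90% completion. The sketch assumes chunks are processed in order and that finished chunks are persisted; the 5-minute chunk size is an arbitrary illustration.

```python
def split_job(total_minutes, chunk_minutes):
    """Divide a job into consecutive (start, end) chunks, in minutes."""
    chunks = []
    start = 0
    while start < total_minutes:
        end = min(start + chunk_minutes, total_minutes)
        chunks.append((start, end))
        start = end
    return chunks


def work_lost_on_revocation(chunks, revoked_at_minute):
    """Minutes of work discarded: only the in-progress chunk is lost."""
    for start, end in chunks:
        if start <= revoked_at_minute < end:
            return revoked_at_minute - start
    return 0


REVOKED_AT = 27  # 90% of a 30-minute job

monolithic_loss = work_lost_on_revocation(split_job(30, 30), REVOKED_AT)
chunked_loss = work_lost_on_revocation(split_job(30, 5), REVOKED_AT)
print(f"lost without subdividing: {monolithic_loss} min, with 5-minute chunks: {chunked_loss} min")
```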
In Summary
53% Cost Reduction
Scenarios Spot has Unlocked for Panopto
Scale our inside-video search technology across our entire customer base.
Accelerate business growth. The money saved with Spot is being reinvested in expanding our team.
We’re hiring!
https://www.panopto.com/careers/
devjobs@panopto.com
Seattle, London, Pittsburgh
Thank you!
Remember to complete your evaluations!

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

AWS re:Invent 2016: Searching Inside Video at Petabyte Scale Using Spot (WIN307)

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Tim Sullivan and Ari Bixhorn, Panopto December 2, 2016 Searching Inside Video at Petabyte-Scale using Spot
  • 2. What to Expect from the Session: Primer on inside-video search. Dive into how we use Spot to search video at scale. Overview of our cross-platform architecture. Best practices for scaling Spot Instances elastically.
  • 4. Video: A Last-mile Problem for Search. 30 trillion web pages; Email and documents; File system contents; Video?
  • 5. 3 minutes, 53 seconds
  • 6. 15 - 90 minutes
  • 7. Title: An Introduction to Network Security Description: A broad overview of network security as defined by today’s hybrid corporate WANs. Tags: Network security, intrusion detection, corporate WAN, firewall, authentication !?
  • 8. 125 words per minute 5,625 words spoken
  • 9. The network is the entry point to your application. It provides the first gatekeepers that control access to the various servers in your environment. Servers are protected with their own operating system gatekeepers, but it is important not to allow them to be deluged with attacks from the network layer. It is equally important to ensure that network gatekeepers cannot be replaced or reconfigured by imposters. In a nutshell, network security involves protecting network devices and the data that they forward. The basic components of a network, which act as the front-line gatekeepers, are the router, the firewall, and the switch. An attacker looks for poorly configured network gatekeepers to exploit. Common vulnerabilities include weak default installation settings, wide-open access controls, and unpatched devices. 50%
  • 10. 5,625 words spoken. 50% have no search value. 2,813 words with search value. With 10 tags, you’ve only covered 0.3% of valuable content.
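The arithmetic on slides 8 to 10 can be reproduced with a short sketch. The 45-minute duration is an assumption inferred from 5,625 words at 125 words per minute (the slides only give a 15 to 90 minute range), and the variable names are illustrative.

```python
import math

WORDS_PER_MINUTE = 125   # speaking rate from slide 8
video_minutes = 45       # assumption: 5,625 words / 125 words per minute

words_spoken = WORDS_PER_MINUTE * video_minutes

# Slide 10 treats half of the spoken words as having no search value;
# the remainder is rounded up to match the slide's 2,813.
searchable_words = math.ceil(words_spoken * 0.5)

tag_count = 10
coverage = tag_count / searchable_words

print(words_spoken, searchable_words, f"{coverage:.2%}")
```

The computed coverage comes out at roughly a third of one percent, in line with the 0.3% quoted on the slide.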
  • 11.
  • 12. Six Types of Video Content Indexing: 1. Manually entered metadata; 2. Transcription; 3. Automatic Speech Recognition (ASR); 4. Optical Character Recognition (OCR); 5. Slide extraction; 6. Viewer notes
  • 13. Demo – Video Search
  • 14. What Led Us to Spot?
  • 15. Our Challenge: Running on AWS since 2009. Growing exponentially (chart axis: 2013-01 through 2016-01). Need to index every video, quickly and cost-efficiently. 15 years of video (400TB) content uploaded monthly. Need to extract metadata out of 4PB of video. 122M unique images have been indexed for OCR. >3TB SOLR index. * Numbers are inclusive of both enterprise and education accounts; numbers do not include on-premises customers.
  • 16. Option 1: On-Demand Amazon EC2 Instances. Cost-prohibitive to offer to all customers. [Chart: cost ($) vs. hours of content, with markers for Budget, Today, and Enable ASR/OCR]
  • 17. Option 2: Make Search an Upsell Capability. [Platform diagram. Stages: Content Ingestion, Content Management, Content Discovery, Content Delivery, Content Consumption. Components: Windows and Mac Clients, Mobile Apps, Video Capture Appliance, Remote Capture Client, Other Ingestion, Transcoding, Editing, Search Indexing, Governance, Analytics, Access Control, Video CMS, Public Hosting, SmartSearch™, Email and Social Integrations, Search Federation, Panopto Streaming, CDN Integration, P2P Streaming, Panopto ECDN, WAN Op Solutions, Interactive Player, Panopto Mobile, Audio Podcast, Embedded Player, Quizzing and Polls]
  • 18. Option 3: Use Reserved Instances (RIs). Theoretically would save costs. RIs work best for predictable workloads. 30 sec SLA to begin indexing results in spiky demand curve vs. flat line. c3.2xlarge pricing (On-Demand hourly: $0.42), listed as Upfront / Monthly / Effective Hourly / Savings over On-Demand: $0 / $213.16 / $0.292 / 30%; $1304 / $75.92 / $0.253 / 40%; $2170 / $0.00 / $0.248 / 41%.
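The effective hourly rates and savings in the RI table on slide 18 can be checked with a small sketch. It assumes a one-year term of 8,760 hours, which is not stated on the slide but reproduces its figures; the function names are illustrative.

```python
HOURS_PER_YEAR = 8760  # assumption: 1-year term, 365 days x 24 hours

def effective_hourly(upfront: float, monthly: float) -> float:
    """Spread the upfront and monthly payments over every hour of the term."""
    return (upfront + 12 * monthly) / HOURS_PER_YEAR

def savings(effective: float, on_demand: float) -> float:
    """Fractional savings of an effective hourly rate versus On-Demand."""
    return 1 - effective / on_demand

ON_DEMAND_HOURLY = 0.42  # c3.2xlarge On-Demand price from the slide

# (upfront, monthly) pairs taken from the slide's table
options = [(0, 213.16), (1304, 75.92), (2170, 0.00)]

for upfront, monthly in options:
    rate = effective_hourly(upfront, monthly)
    pct = savings(rate, ON_DEMAND_HOURLY)
    print(f"${upfront} upfront: ${rate:.3f}/hour, {pct:.0%} savings")
```

The savings only materialize if the instance is busy for every hour of the term, which is why the spiky demand curve described on the slide undermines this option.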
  • 19. Option 3: Use Reserved Instances (RIs). [Chart: # instances over time t; RI capacity vs. demand, showing Delayed Start and Waste]
  • 20. Option 3: Use Reserved Instances (RIs). [Chart: # instances over time t; RI capacity vs. demand, showing Overspend and Waste]
  • 21. Option 4: Buy Our Own Hardware
  • 22. Option 5: Spot Instances. Excess EC2 capacity auctioned at steeply discounted prices. Spot Instances can be accessed on demand to meet our variable needs. [Chart: On-Demand Instances, with Spot Instances added when bid ≥ market]
  • 23. On-Demand vs. Spot Instances. Both: Pre-configured or custom machine images. Configure security and network access. Choose from instance types and locations. Use static IP endpoints. Attach persistent block storage to instances. On-Demand: Pay a fixed price by the hour. Spot: Pay a variable price by the hour.
  • 25. The Spot Auction: Set a bid price (for example, $0.27). Instances run while bid ≥ market price. Instances terminate when bid < market price.
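The bid rule on slide 25 reduces to a single comparison. The sketch below applies it to a hypothetical series of market prices; only the $0.27 bid comes from the slide, and the function name is illustrative.

```python
def instance_allowed_to_run(bid: float, market_price: float) -> bool:
    """Slide rule: instances run while bid >= market price."""
    return bid >= market_price

bid = 0.27  # example bid from the slide

# Hypothetical market prices over successive intervals (not from the slides)
market_prices = [0.14, 0.24, 0.34, 0.24]

states = [
    "run" if instance_allowed_to_run(bid, price) else "terminate"
    for price in market_prices
]
print(states)
```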
  • 26. Spot Considerations: Is your workload appropriate for potential volatility? How to deal with a lack of capacity? Can you run on a wide range of instance types (via Spot Fleet)? Look at historical bid prices for your instance types and regions to estimate your savings.
  • 28. The Importance of Windows to our Architecture: Single codebase for cloud and on-premises. For on-prem customers, Windows is often a requirement. Windows is therefore critical to our cloud architecture as well. [Diagram: On-Prem, Cloud]
  • 29. Panopto Cloud on AWS Distributed across Availability Zones
  • 30. Cross-Platform Implementation Web Servers App Servers Database Speech Recognition Apache SOLR
  • 31. Using Auto Scaling Groups Demand Running Instances
  • 32. Using AWS CloudFormation Define ASGs and auto-scale rules
  • 33. From On-Demand to Spot
      OnDemandLaunchConfig : {
        Type : AWS::AutoScaling::LaunchConfiguration,
        Properties : {
          SecurityGroups : { Ref : backendSecurityGrpIds },
          IamInstanceProfile : { Ref : BackendEncoders...},
          ImageID : { Ref : ami },
          InstanceType : { Ref : instanceType },
          InstanceMonitoring : false,
          AssociatePublicIpAddress : true,
          EbsOptimized : { Ref : ebsOptimized },
          BlockDeviceMappings : [ { DeviceName : xvdca } ]
        }
      }
      SpotLaunchConfig : {
        Type : AWS::AutoScaling::LaunchConfiguration,
        Condition : CreateSpotGroup,
        Properties : {
          SecurityGroups : { Ref : backendSecurityGrpIds },
          IamInstanceProfile : { Ref : BackendEncoders...},
          ImageID : { Ref : ami },
          InstanceType : { Ref : instanceType },
          SpotPrice : { Ref : spotPrice },
          InstanceMonitoring : false,
          AssociatePublicIpAddress : true,
          EbsOptimized : { Ref : ebsOptimized },
          BlockDeviceMappings : [ { DeviceName : xvdca } ]
        }
      }
  • 34. Bidding Strategy: Start Simple. Sealed-bid, second-price auction. Set your bid to the market price of an On-Demand Instance. [Chart: Spot prices $0.14, $0.24, $0.34; On-Demand Instance Price: $0.84]
  • 35. The Challenge of Long-Running Jobs: The longer the job, the greater the chance of instance revocation. Short window to determine how best to fail over (2 minutes). [Chart: Chance of Instance Revocation vs. Job Length]
  • 36. Managing Jobs in the Face of Instance Revocation. [Flowchart: Spot market price increases; the “Spotter” service waits until T-30s, then asks “Is Job Done?” Yes: No Action. No: 1. Save State, 2. Kill Job, 3. Reallocate.]
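The flowchart on slide 36 can be sketched as two small decision functions. This is not Panopto's "Spotter" implementation: the `Job` class, the function names, and the action strings are hypothetical, and the sketch assumes T is the scheduled termination time of the instance.

```python
from dataclasses import dataclass
from typing import List

CHECK_LEAD_SECONDS = 30  # from the slide: "Wait until T-30s"

@dataclass
class Job:
    job_id: str
    done: bool

def seconds_to_wait(now: float, termination_time: float) -> float:
    """Time the job may keep running before the T-30s check."""
    return max(0.0, termination_time - CHECK_LEAD_SECONDS - now)

def actions_at_check(job: Job) -> List[str]:
    """Decide what to do at T-30s, following the slide's flowchart."""
    if job.done:
        return []  # "Is Job Done? Yes": no action
    # "No": save state, kill the job, then reallocate it elsewhere
    return ["save_state", "kill_job", "reallocate"]

# With a 2-minute revocation notice, the job keeps running for 90 more seconds.
print(seconds_to_wait(now=0.0, termination_time=120.0))
print(actions_at_check(Job("ocr-1", done=False)))
```

Waiting until late in the notice window gives a nearly finished job the chance to complete, so state is saved only when that is actually necessary.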
  • 37. Scaling Up with Predictive Job Modeling. Inputs: 1. Number of waiting jobs; 2. Number of jobs currently processing; 3. When current jobs expected to finish; 4. Incoming jobs in the last <interval>; 5. Number of jobs expected to arrive; 6. Time to spin up new machine; 7. SLA by job. Output: More processing capacity required? [Diagram: Data Scientists ?]
  • 39. Scaling Down: If the rate of incoming and in-process jobs is less than current processing capacity, then we’re in a scale-down state. Identify instances not processing jobs. Then identify those within 15 minutes of a billing hour. [Diagram: instances labeled Active, Hold, or Scale Down]
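The scale-down rule on slide 39 can be sketched as a classifier that assigns each instance one of the slide's three labels. The `Instance` fields and function names are hypothetical, and the sketch assumes billing hours are counted from instance launch.

```python
from dataclasses import dataclass

BILLING_HOUR_MINUTES = 60
SCALE_DOWN_WINDOW_MINUTES = 15  # from the slide: within 15 minutes of a billing hour

@dataclass
class Instance:
    instance_id: str
    busy: bool                 # currently processing a job?
    minutes_since_launch: int  # assumption: billing hours start at launch

def in_scale_down_state(job_rate: float, capacity: float) -> bool:
    """Scale-down state: incoming plus in-process job rate is below capacity."""
    return job_rate < capacity

def classify(instance: Instance) -> str:
    """Label an instance Active, Hold, or Scale Down while in a scale-down state."""
    if instance.busy:
        return "Active"
    minutes_into_hour = instance.minutes_since_launch % BILLING_HOUR_MINUTES
    minutes_left = BILLING_HOUR_MINUTES - minutes_into_hour
    if minutes_left <= SCALE_DOWN_WINDOW_MINUTES:
        return "Scale Down"  # idle and close to the next billing hour
    return "Hold"            # idle, but the current hour is already paid for

fleet = [Instance("i-1", True, 50), Instance("i-2", False, 50), Instance("i-3", False, 20)]
if in_scale_down_state(job_rate=1.0, capacity=3.0):
    print([classify(i) for i in fleet])
```

Holding idle instances until the end of the hour keeps already-paid capacity available in case demand picks up again.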
  • 40. But what if there’s a deficit of Spot capacity? Operate two Auto Scaling groups for each backend worker pool (for example, Automatic Speech Recognition): one Spot ASG and one On-Demand ASG. When actual Spot capacity < desired capacity, offload to On-Demand.
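The fallback rule on slide 40 amounts to covering the Spot shortfall with On-Demand capacity. The sketch below shows only that calculation; the function name is hypothetical, and how the result is applied to the On-Demand Auto Scaling group is not described in the slides.

```python
def on_demand_desired(desired_capacity: int, actual_spot_capacity: int) -> int:
    """On-Demand instances needed to cover any shortfall in Spot capacity."""
    return max(0, desired_capacity - actual_spot_capacity)

# Spot delivered 6 of the 10 instances the worker pool wants:
print(on_demand_desired(desired_capacity=10, actual_spot_capacity=6))

# Spot fully covers demand, so the On-Demand group can stay empty:
print(on_demand_desired(desired_capacity=10, actual_spot_capacity=10))
```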
  • 41. Spot Futures at Panopto
  • 42. Move to Spot Fleet: Ability to launch the most cost-efficient instance type for any job. Lower prices with diversified resources. Ability to apply custom weighting (create capacity units based on our app needs). Challenge: no accounting for the cost of EBS. Challenge: lacking ASG’s health checks. Challenge: lacking ASG’s tag propagation.
  • 43. From Immutable to Dynamic Instance Configuration: Need to account for different processing capacity of different instance types. Will need to optimize number of workers being run in parallel on each VM. Substantial cost savings potential. Today: Immutable. Pro: Spin up instances quickly. Con: Could be more cost-efficient. Future: Dynamic. Choose the best Availability Zone and instance type based on market price.
  • 44. Subdividing Jobs. Today: Painful to cancel a 90% complete, 30-minute OCR indexing job. Future: Subdivide job for grid processing. Grid processing minimizes impact of Spot Instance loss. Also allows greater parallelization for faster user-visible time to task completion.
  • 47. Scenarios Spot has Unlocked for Panopto: Scale our inside-video search technology across our entire customer base. Accelerate business growth. The money saved with Spot is being reinvested in expanding our team.
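The idea of subdividing a job can be sketched as splitting its work range into fixed-size chunks that are processed independently. The 5-minute chunk size and the function name are hypothetical, as the slides do not say how jobs would be divided.

```python
from typing import List, Tuple

def subdivide(total_seconds: int, chunk_seconds: int) -> List[Tuple[int, int]]:
    """Split [0, total_seconds) into consecutive (start, end) chunks."""
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]

# A 30-minute unit of work split into 5-minute chunks (chunk size is an assumption)
chunks = subdivide(total_seconds=30 * 60, chunk_seconds=5 * 60)
print(len(chunks), chunks[-1])
```

If a Spot Instance is revoked, only the chunk in flight on that instance has to be redone rather than the whole job, and chunks can be spread across several instances to finish sooner.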