Webinar: 2 Billion Data Points Each Day

This webinar follows the process of evaluating different big data platforms against varying use cases and business requirements, and explains how big data professionals can choose the right technology to transform their business. During this session, Ooyala CTO Sean Knapp will discuss why Ooyala selected DataStax as the big data platform powering its business, and how Ooyala provides real-time video analytics that help media companies create deeply personalized viewing experiences for more than a quarter of all Internet video viewers each month.

  • For first line re: executives - Global consumer and enterprise experience. After videos and analytics pings line - 1 billion plays per month and 2 billion analytics pings per day.

Presentation Transcript

  • Data Points += 2 billion ... daily July 24th, 2013
  • 2 ABOUT ME
    Sean Knapp (@seanknapp)
    • Co-Founder, EVP & Chief Product Officer (formerly CTO)
    • Senior Software Engineer @ Google
    • Built & launched iGoogle
    • Led Google’s Frontend Web Search and Ads UX teams, who drove a $1B increase in revenue for Google in 18 months
    • B.S. & M.S. in Computer Science from Stanford University
  • 3 OOYALA OVERVIEW
    • Suite of products and services providing white-label management, hosting, and distribution of video online
    • Hundreds of customers including ESPN, Bloomberg, Disney, Miramax, Univision, Dell, Pac-12 Networks, and more
    • 100M+ unique users streaming more than 1B videos monthly, generating more than 2B analytics events daily
    • 280 employees located in Silicon Valley, NYC, London, Tokyo, Sydney, Singapore, Seoul & Guadalajara
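To put the volumes on the overview slide into per-second terms, a quick back-of-the-envelope calculation can help. The sketch below assumes events arrive uniformly over a 24-hour day and that a month has 30 days; neither assumption comes from the slides, real traffic is bursty, and the variable names are purely illustrative.

```python
# Back-of-the-envelope rates from the figures on the overview slide.
# Assumptions (not from the slides): uniform arrival, 30-day month.
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30

events_per_day = 2_000_000_000     # "more than 2B analytics events daily"
plays_per_month = 1_000_000_000    # "more than 1B videos monthly"
unique_users = 100_000_000         # "100M+ unique users"

# Average write rate the platform has to absorb.
avg_events_per_second = events_per_day / SECONDS_PER_DAY

# How many analytics events a single video play generates on average.
events_per_play = events_per_day * DAYS_PER_MONTH / plays_per_month

# How many videos a typical user streams in a month.
plays_per_user_per_month = plays_per_month / unique_users

print(f"average events/second: {avg_events_per_second:,.0f}")
print(f"events per play:       {events_per_play:.0f}")
print(f"plays per user/month:  {plays_per_user_per_month:.0f}")
```

Under these assumptions the average works out to roughly 23,000 events per second, about 60 events per play, and about 10 plays per user per month. Peak load would be higher than the average.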
  • 4 EVOLVING INSIGHTS
    • Insights circa ’07
      • How many videos did I show this week?
      • What were my monthly uniques?
    • Insights circa ’09
      • How many ad impressions did I receive from users in each Designated Market Area (DMA)?
    • Insights circa ’11
      • How many users do I have right now?
    • Insights circa ’13
      • How does the revenue from iPad users age 25-34 compare to those on XBox?
    Diagram labels: Weekly, Instant, Summary, Detailed, Complex
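The ’13 question (revenue from iPad users age 25-34 versus XBox) is a filter followed by a grouped aggregation over individual events. The sketch below illustrates only the shape of such a query. The event records, field names, and the `revenue_by_device` helper are made up for this example and say nothing about Ooyala's actual schema or query engine.

```python
from collections import defaultdict

# Made-up sample events; field names and values are purely illustrative.
events = [
    {"device": "iPad", "age": 29, "revenue_cents": 40},
    {"device": "XBox", "age": 31, "revenue_cents": 25},
    {"device": "iPad", "age": 41, "revenue_cents": 30},  # outside the age band
    {"device": "XBox", "age": 27, "revenue_cents": 50},
    {"device": "iPad", "age": 25, "revenue_cents": 10},
]

def revenue_by_device(events, min_age, max_age):
    """Sum revenue per device for users whose age is in [min_age, max_age]."""
    totals = defaultdict(int)
    for event in events:
        if min_age <= event["age"] <= max_age:
            totals[event["device"]] += event["revenue_cents"]
    return dict(totals)

totals = revenue_by_device(events, 25, 34)
print(totals)
```

Answering this kind of question needs per-event dimensions (device, age band) to be retained, which is why it sits at the "complex" end of the slide compared with a weekly play count.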
  • 5 BIG DATA @ OOYALA
    • 1st Gen (circa ’07)
      • Process: Hadoop MapReduce
      • Language: Ruby
      • Store: MySQL
    • 2nd Gen (circa ’09)
      • Process: Hadoop MapReduce
      • Language: Ruby
      • Store: Cassandra 0.5+
    • 3rd Gen (circa ’11)
      • Process: MapReduce, Storm
      • Language: Ruby, Scala
      • Store: DataStax Enterprise (300TB disk, 1TB RAM)
    • 4th Gen (circa ’13)
      • Process: MapReduce, Storm, Spark, Hive
      • Language: Scala
      • Store: DataStax Enterprise (1.5PB disk, 14TB RAM)
    Diagram labels: Batch, Realtime, Summary, Granular, Queryable
  • 6 OUR GOALS
    • Evolve our Analytics product from a time-delayed, static reporting system to a realtime, granular, and dynamic query engine
    • Launch our Content Recommendation engine, an entirely new product offering
    • Scale to billions of user events on a daily basis
    • Support an ever expanding set of global customers
    • Deliver a 5-9’s platform
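A "5-9’s" target means 99.999% availability. As a rough illustration of how little downtime that allows, the sketch below converts a number of nines into a yearly downtime budget. It assumes a 365-day year, and the function name is illustrative.

```python
# Downtime budget implied by an availability target of N nines.
# Assumption: a 365-day year; the helper name is illustrative.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def downtime_seconds_per_year(nines: int) -> float:
    """Allowed downtime per year for an availability of `nines` nines."""
    unavailability = 10 ** -nines      # five nines -> 0.00001
    return unavailability * SECONDS_PER_YEAR

for n in (3, 4, 5):
    minutes = downtime_seconds_per_year(n) / 60
    print(f"{n} nines: {minutes:.1f} minutes of downtime per year")
```

Five nines leaves a little over five minutes of total downtime per year, which helps explain why availability features such as multi-datacenter operation with no single point of failure weigh heavily in the selection criteria.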
  • 7 OUR CHALLENGES
    • Very small ops team supporting global infrastructure
    • Not enough capacity for performance tuning
    • Routinely fell behind the latest releases
    • Didn’t know which releases were stable enough
    • Unforeseen product requirements beyond the next 12 months
    • Existing solution would have cost nearly $1M to scale to just 100TB
  • 8 SELECTION PROCESS
    • Key Criteria
      • Scalability: PB+, 100k+ operations per second
      • Cost / price-performance
      • Availability: 5-9’s
      • Flexibility: schemaless
    • Alternative Technologies
      • Other RDBMS systems
      • HBase
      • Voldemort
  • 9 WHY CASSANDRA
    • First learned about C* in Nov 2008
    • First deployed C* in Sep 2009
    • Compelling Features
      • Scalability: PB+, high ops/sec, billions of rows and columns
      • Performance: designed specifically for heavy workloads similar to Ooyala’s
      • Cost: could run on commodity hardware
      • Availability: multi-datacenter with no single point of failure
      • Community: strong, unified direction
  • 10 RESULTS
    • Business
      • Launched the next-gen of our Analytics in ’09 that solidified Ooyala as the leader in our industry
      • Launched our Content Recommendation engine in ’12 that again separated us from the industry
    • Technical
      • 1,000x the scale of just 5 years ago
      • Much higher ROI: 1PB+ for < $500k in hardware
      • No more 3am pager alerts
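The ROI claim can be made concrete by comparing cost per terabyte, using the "nearly $1M to scale to just 100TB" figure from the challenges slide and the "1PB+ for < $500k" figure above. The sketch below treats both as round numbers and assumes decimal units (1 PB = 1,000 TB). Because the slides only give bounds, the result is a rough estimate.

```python
# Rough cost-per-terabyte comparison from the two slide figures.
# Assumptions: round numbers, decimal units (1 PB = 1,000 TB).
TB_PER_PB = 1_000

old_cost_usd, old_capacity_tb = 1_000_000, 100           # "nearly $1M" for 100TB
new_cost_usd, new_capacity_tb = 500_000, 1 * TB_PER_PB   # "< $500k" for "1PB+"

old_cost_per_tb = old_cost_usd / old_capacity_tb
new_cost_per_tb = new_cost_usd / new_capacity_tb
improvement = old_cost_per_tb / new_cost_per_tb

print(f"previous solution: ~${old_cost_per_tb:,.0f} per TB")
print(f"current platform:  at most ~${new_cost_per_tb:,.0f} per TB")
print(f"improvement:       roughly {improvement:.0f}x")
```

On these figures the hardware cost per terabyte drops from about $10,000 to about $500 or less, an improvement of roughly 20x.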
  • 11 Q&A
  • THANK YOU