
Webinar: 2 Billion Data Points Each Day

This webinar follows the process of evaluating different big data platforms based on varying use cases and business requirements, and explains how big data professionals can choose the right technology to transform their business. During this session, Ooyala CTO Sean Knapp will discuss why Ooyala selected DataStax as the big data platform powering their business, and how they provide real-time video analytics that help media companies create deeply personalized viewing experiences for more than a quarter of all Internet video viewers each month.

Published in: Technology, News & Politics

  • For the first line, regarding executives: global consumer and enterprise experience. After the videos and analytics pings line: 1 billion plays per month and 2 billion analytics pings per day.
  • Transcript of "Webinar: 2 Billion Data Points Each Day"

    1. Data Points += 2 billion ... daily (July 24th, 2013)
    2. ABOUT ME
       Sean Knapp (@seanknapp)
       • Co-Founder, EVP & Chief Product Officer (formerly CTO)
       • Senior Software Engineer @ Google
       • Built & launched iGoogle
       • Led Google’s Frontend Web Search and Ads UX teams, who drove a $1B increase in revenue for Google in 18 months
       • B.S. & M.S. in Computer Science from Stanford University
    3. OOYALA OVERVIEW
       • Suite of products and services providing white-label management, hosting, and distribution of video online
       • Hundreds of customers including ESPN, Bloomberg, Disney, Miramax, Univision, Dell, Pac-12 Networks, and more
       • 100M+ unique users streaming more than 1B videos monthly, generating more than 2B analytics events daily
       • 280 employees located in Silicon Valley, NYC, London, Tokyo, Sydney, Singapore, Seoul & Guadalajara
    4. EVOLVING INSIGHTS
       • Insights circa ’07
         • How many videos did I show this week?
         • What were my monthly uniques?
       • Insights circa ’09
         • How many ad impressions did I receive from users in each Designated Market Area (DMA)?
       • Insights circa ’11
         • How many users do I have right now?
       • Insights circa ’13
         • How does the revenue from iPad users age 25-34 compare to those on Xbox?
       Diagram labels: Weekly, Instant, Summary, Detailed, Complex
    5. BIG DATA @ OOYALA
       • 1st Gen (circa ’07)
         • Process: Hadoop MapReduce
         • Language: Ruby
         • Store: MySQL
       • 2nd Gen (circa ’09)
         • Process: Hadoop MapReduce
         • Language: Ruby
         • Store: Cassandra 0.5+
       • 3rd Gen (circa ’11)
         • Process: MapReduce, Storm
         • Language: Ruby, Scala
         • Store: DataStax Enterprise (300TB disk, 1TB RAM)
       • 4th Gen (circa ’13)
         • Process: MapReduce, Storm, Spark, Hive
         • Language: Scala
         • Store: DataStax Enterprise (1.5PB disk, 14TB RAM)
       Diagram labels: Batch, Realtime, Summary, Granular, Queryable
    6. OUR GOALS
       • Evolve our Analytics product from a time-delayed, static reporting system to a realtime, granular, and dynamic query engine
       • Launch our Content Recommendation engine, an entirely new product offering
       • Scale to billions of user events on a daily basis
       • Support an ever-expanding set of global customers
       • Deliver a five-nines (99.999% availability) platform
    7. OUR CHALLENGES
       • Very small ops team supporting global infrastructure
       • Not enough capacity for performance tuning
       • Routinely fell behind the latest releases
       • Didn’t know which releases were stable enough
       • Unforeseen product requirements beyond the next 12 months
       • Existing solution would have cost nearly $1M to scale to just 100TB
    8. SELECTION PROCESS
       • Key Criteria
         • Scalability: PB+, 100k+ operations per second
         • Cost / price-performance
         • Availability: five-nines (99.999%)
         • Flexibility: schemaless
       • Alternative Technologies
         • Other RDBMS systems
         • HBase
         • Voldemort
    9. WHY CASSANDRA
       • First learned about C* in Nov 2008
       • First deployed C* in Sep 2009
       • Compelling Features
         • Scalability: PB+, high ops/sec, billions of rows and columns
         • Performance: designed specifically for heavy workloads similar to Ooyala’s
         • Cost: could run on commodity hardware
         • Availability: multi-datacenter with no single point of failure
         • Community: strong, unified direction
    10. RESULTS
       • Business
         • Launched the next-gen of our Analytics in ’09 that solidified Ooyala as the leader in our industry
         • Launched our Content Recommendation engine in ’12 that again separated us from the industry
       • Technical
         • 1,000x the scale of just 5 years ago
         • Much higher ROI: 1PB+ for < $500k in hardware
         • No more 3am pager alerts
    11. Q&A
    12. THANK YOU
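
Several figures quoted on the slides are easier to compare once they are put in common units. The following is a minimal back-of-envelope sketch, not something taken from the talk: the inputs are copied from the deck, while the variable names, the non-leap-year assumption, and the choice to treat 1 PB as 1,000 TB are illustrative assumptions.

```python
# Back-of-envelope arithmetic for figures quoted on the slides.
# Inputs come from the deck; variable names are illustrative only.

SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year

# "2B analytics events daily" expressed as an average rate.
events_per_day = 2_000_000_000
avg_events_per_sec = events_per_day / SECONDS_PER_DAY

# Five-nines availability target: downtime allowed per year.
availability = 0.99999
downtime_min_per_year = (1 - availability) * MINUTES_PER_YEAR

# Hardware cost per TB: the earlier estimate vs. the reported result.
old_cost_per_tb = 1_000_000 / 100   # "nearly $1M to scale to just 100TB"
new_cost_per_tb = 500_000 / 1_000   # "1PB+ for < $500k" (1 PB taken as 1,000 TB)

print(f"average event rate: {avg_events_per_sec:,.0f} events/s")
print(f"five-nines downtime budget: {downtime_min_per_year:.2f} min/year")
print(f"cost per TB: ${old_cost_per_tb:,.0f} before vs. under ${new_cost_per_tb:,.0f} after")
```

The average works out to roughly 23,000 events per second, below the 100k+ operations-per-second criterion. The deck does not say how many store operations each event triggers or how far peaks exceed the average, so that comparison is only indicative. On the same rough basis, the quoted hardware figures imply about a 20x drop in cost per TB.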