One Billion Rows per Second: Analytics for the Digital Media Markets
 

  • Across traditional desktop, mobile, and now gaming platforms, there are billions of advertising events occurring every day. Many of these are priced and bought in real time. Willie Sutton was once asked, "Why do you rob banks?" His answer: "That's where the money is." For me, the reason I was enticed by this vertical is similar: that's where the data is.
  • Strategic implications:
  • Practically speaking, we define this as: data freshness on the order of minutes, but queries over the data, made through our dashboard, return in seconds. Hadoop isn't enough.
  • Hadoop summarizes and precomputes a ton.
  • We all know that things taste better when they're fresh. Data is no different. As Jeff Jonas says, there is no value in knowing where the traffic was five minutes ago.
  • Dialogue with the data. Eliminate the chain of data bureaucrats and put the data in the hands of the decision maker. Get in the car & drive yourself.

Presentation Transcript

  • One Billion Rows Per Second:
    Analytics for the Digital Media Markets
    STRATA SUMMIT NYC
    September 21, 2011
    MICHAEL DRISCOLL
    CO-FOUNDER & CTO
    @medriscoll
  • Taming the Inferno of the Online Ad Markets
    • billions of microtransactions per day
    • dozens of publisher, advertiser, & audience attributes
  • Goal: Fast Dashboards Over Big Data
    dashboard
    queries in seconds
    database
    data crunched in minutes
    ingestion
  • Solution 1: Relational Database
    dashboard
    queries in minutes
    database: MPP relational DB
    data crunched in minutes
    ingestion: Hadoop
  • Solution 2: HBase
    dashboard
    queries in seconds
    database: HBase
    data crunched in hours
    ingestion: Hadoop
  • Solution 3: Do It Ourselves: Druid
    dashboard
    queries in seconds
    database: Druid
    data crunched in minutes
    ingestion: Hadoop
  • Four Principles of Druid’s Performance at Scale
    SUMMARIZE, DISTRIBUTE, PARALLELIZE, STORE IN-MEMORY
    100x smaller vs raw data
    100x throughput vs a single node
    100x faster vs reading disk
    = 10^6 factor speed-up
    Druid can filter and aggregate over 1 billion rows per second on a 50-core cluster, or 20 million rows per core per second
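
The figures on this slide can be checked with a little arithmetic, and the "summarize" principle can be illustrated with a small roll-up. The following is a minimal Python sketch, not Druid's implementation. It assumes the 10^6 figure is the product of the three 100x factors listed on the slide. The event fields (`timestamp`, `publisher`, `advertiser`, `price_micros`) and the hourly bucket are hypothetical choices made only for illustration.

```python
from collections import defaultdict

# Figures quoted on the slide.
FACTOR_PER_PRINCIPLE = 100       # each listed gain is "100x"
NUM_FACTORS = 3                  # assumption: three 100x factors multiply
ROWS_PER_SECOND = 1_000_000_000  # "1 billion rows per second"
CORES = 50                       # "on a 50-core cluster"


def combined_speedup(factor: int, count: int) -> int:
    """Multiply `count` independent gains of size `factor`."""
    return factor ** count


def rows_per_core(rows_per_second: int, cores: int) -> int:
    """Per-core scan rate implied by the cluster-wide rate."""
    return rows_per_second // cores


def summarize(events, bucket_seconds=3600):
    """Roll raw events up to one row per (time bucket, publisher, advertiser).

    Hypothetical schema: each event is a dict with an integer `timestamp`
    in seconds, `publisher`, `advertiser`, and an integer `price_micros`.
    """
    totals = defaultdict(lambda: {"impressions": 0, "spend_micros": 0})
    for event in events:
        key = (
            event["timestamp"] // bucket_seconds,
            event["publisher"],
            event["advertiser"],
        )
        totals[key]["impressions"] += 1
        totals[key]["spend_micros"] += event["price_micros"]
    return dict(totals)


if __name__ == "__main__":
    print(combined_speedup(FACTOR_PER_PRINCIPLE, NUM_FACTORS))
    print(rows_per_core(ROWS_PER_SECOND, CORES))
```

The roll-up shrinks the data because many raw events share the same bucket and attribute values. How much it shrinks depends on the number of attributes and the bucket size, so the 100x figure on the slide should be read as the speaker's reported result rather than something this sketch demonstrates.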
  • Consequences of Speed: Data Freshness
    photo credit: Lars P. http://www.flickr.com/photos/lars_p/4911238308/sizes/o/in/photostream/
  • Consequences of Speed: Blue Sky Exploration
    photo credit: MonkeyAt Large http://www.flickr.com/photos/monkeyatlarge/16645379/sizes/l/in/photostream/
  • Consequences of Speed: Interactivity
    photo credit: tonylanciabeta http://www.flickr.com/photos/tonysphotos/3305157904/sizes/o/in/photostream/
  • One Billion Rows Per Second:
    Analytics for the Digital Media Markets
    QUESTIONS? CONTACT ME AT MIKE@METAMARKETSGROUP.COM
    MICHAEL DRISCOLL
    CO-FOUNDER & CTO
    @medriscoll