Building Data Start-Ups: Fast, Big, and Focused

======================================================
1. Building Data Start-ups: Fast, Big, and Focused
======================================================

* 2 parts today:

(i) forces behind big data opportunity
(ii) big data stack and how to compete within it

* building a data start-up is a bit like Sumo Wrestling

* data is heavy, has weight - we need agile strategies to succeed

* today: talk about opportunities for data, strategies for success

* in a nutshell: data start-ups must be fast, big, and focused


================================================
2. The Big Data Opportunity
================================================

* it's a cliche by now: there is a mountain of data in this world

* understanding the forces behind it is critical to a data start-up's strategy

<transition>: what are some of the tectonic forces at work?


================================================
3-4. Attack of the Exponentials
================================================

* I call these the 'attack of the exponentials'

* VCs like curves like these

[transition]

* over the past few decades, the cost of storage, CPU, and bandwidth has dropped exponentially, while network access has shot up

* in 1980, a terabyte of storage cost $14 MILLION - today it costs $47
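
The two storage price points imply a steady rate of decline that can be worked out directly. This is a minimal sketch, assuming "today" means 2011 (the year of the talk) and that the decline between the two points was smoothly exponential; the function name is illustrative only.

```python
import math

def halving_time_years(start_cost, end_cost, years):
    """Years for cost to halve, assuming a smooth exponential decline."""
    # Model: cost(t) = start_cost * 0.5 ** (t / halving_time)
    halvings = math.log2(start_cost / end_cost)
    return years / halvings

# Figures from the talk: $14 million per terabyte in 1980, $47 "today".
# Assumption: "today" is 2011, the year the talk was given.
storage_halving = halving_time_years(14_000_000, 47, 2011 - 1980)
annual_decline = 1 - 0.5 ** (1 / storage_halving)

print(f"storage cost halved roughly every {storage_halving:.1f} years")
print(f"equivalent to about a {annual_decline:.0%} price drop per year")
```

Under these assumptions the cost halves in well under two years, which is the kind of curve the section title refers to.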

<transition>: exponential economics, together with two other forces

================================================
5. Intersection of Three Forces
================================================

* ... form the inputs to this massive increase in data, the data singularity

* sensor networks: the phones, GPS devices, laptops, and instrumented spimes

* cloud computing has democratized computing power & storage and made them a utility

( "even if it turns out that the cloud is actually just some place in Virginia.")

================================================
6-7. Data Value Must Exceed Data Cost
================================================

* the laws of economics have not changed: value must exceed cost

* the upper left side of this graph shows data whose value exceeded
its cost of collecting, storing, and computing over a decade ago

* the human genome data cost $3 billion (in 2000)

[shift slide]

* but as the tide shifts, new classes of data are revealed as being valuable

* the dog genome cost only $30 million (in 2005)

* web log data used to be tossed; now it's cheap enough to collect,
store, and compute over

* I encourage all of you to think of a data source that was previously
not collected, or not kept around, and mull the possibilities
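
One way to mull those possibilities is to ask when a data source crosses the line where its value exceeds its cost. The sketch below assumes the combined cost of collecting, storing, and computing over the data halves at a fixed interval while its value stays constant. Every number in it is hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def years_until_worth_keeping(cost_now, value, halving_years):
    """Years until a falling cost drops to the (fixed) value of the data.

    Returns 0.0 if the data is already worth at least what it costs.
    """
    if value <= 0:
        raise ValueError("value must be positive")
    if cost_now <= value:
        return 0.0
    # Model: cost(t) = cost_now * 0.5 ** (t / halving_years)
    # Solve cost(t) == value for t.
    return halving_years * math.log2(cost_now / value)

# Hypothetical example: a log stream that costs $80,000 a year to collect,
# store, and compute over, but yields only $10,000 a year in value.
# Assumed halving time for cost: 2 years.
wait = years_until_worth_keeping(80_000, 10_000, halving_years=2.0)
print(f"worth keeping in about {wait:.0f} years")
```

With a short halving time, even data that costs several times its value today can become worth keeping within a few years.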

<transition>: with that, I would like to now talk about the emerging stack,
and the strategies for being successful within it

================================================
8-9, 10-11. Success on the Data Stack
================================================

* here is my vision of the emerging big data stack

* at bottom is data - persistence layer - databases - the brawn

* in the middle is analytics - the intelligence layer

* at the top - services, built on all the brains and brawn beneath

[ transitions in quick succession ]

* I argue that data start-ups, to succeed, must have

== FAST data, BIG analytics, and FOCUSED services ==

* let's take each of these in turn,
exploring the competitive axes at each layer
starting from the bottom of the stack, data

================================================
12. FAST
================================================

* as I said before, data is heavy

* being able to move big data quickly is key

* let's pull the data layer out of the stack & examine it

================================================
13. Fast Data
================================================

* so we have the two competitive axes on the data layer

* the first axis is scale: for data, the scaling issue has been solved.

* Hadoop


Usage Rights

CC Attribution License

  • Thank you for sharing this presentation - I like that you went beyond saying what big data is and what it is used for, and talked about why we have big data and why it is useful.

    FYI, I cited this slide set in a presentation I prepared - an introduction to big data for marketers [available here: http://www.slideshare.net/acanhoto/cim-2012-big-data]
  • I want to first thank O’Reilly for putting together this event, and all of you for tuning in from around the globe.
    The Data Opportunity in 2 parts:
    I. The Opportunity: Why now, what forces are driving the data explosion
    II. The Technology Stack: What does the Big Data technology stack look like – where are the opportunities and risks?
    Data is heavy.

Presentation Transcript

  • Building Data Start-ups:
    Fast, Big, and Focused
    Michael E. Driscoll, CTO, Metamarkets
    @medriscoll
    O’Reilly Strata Online | May 25, 2011
  • The Big Data
    Opportunity
  • The Attack of the Exponentials
  • The Attack of the Exponentials
  • The Intersection of Three Forces
    Yields Higher Volume & Velocity of Data
    exponential economics
    sensor networks
    cloud computing
  • Data Value Must Exceed Data Cost
  • Data Value Must Exceed Data Cost
    ... New Classes of Data are Now Valuable
  • Success on the Data Stack
    Services
    Analytics
    Data
  • Success on the Data Stack
    Fast
    Services
    Analytics
    Fast
    Data
  • Success on the Data Stack
    Fast, Big
    Services
    Big
    Analytics
    Fast
    Data
  • Success on the Data Stack
    Fast, Big, and Focused
    Focused
    Services
    Big
    Analytics
    Fast
    Data
  • #1: Fast
  • Success on the Data Stack
    Fast Data
    [chart: data-layer products plotted by speed (batch to real-time) against scale (megabytes to petabytes), marked as free, open-source or commercial: Kdb, Netezza, Esper, Vertica, MongoDB, InfoBright, Aster, MySQL, MapR, Greenplum, Postgres, Hadoop]
  • Fast Data With Cheap Memory
    1964 – Univac 2k
    $51 million/MB
    2011 – DDR 1GB
    1 cent/MB
    data sources: http://www.sharkyextreme.com & http://www.webservicessummit.com/Trends/TechTrends1/img11.html, plotted with ggplot2
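
The per-megabyte prices on this slide translate directly into the cost of holding a whole dataset in memory. This is a rough sketch, assuming 1 GB = 1,024 MB and using the slide's two price points; the 100 GB dataset size is a made-up example.

```python
# Price points from the slide, in dollars per megabyte.
COST_PER_MB = {1964: 51_000_000.0, 2011: 0.01}
MB_PER_GB = 1024  # assumption: binary gigabytes

def in_memory_cost(dataset_gb, year):
    """Dollar cost of enough RAM to hold dataset_gb at that year's price."""
    return dataset_gb * MB_PER_GB * COST_PER_MB[year]

dataset_gb = 100  # hypothetical working set
for year in (1964, 2011):
    print(f"{year}: ${in_memory_cost(dataset_gb, year):,.2f}")
```

At 2011 prices the hypothetical working set fits in memory for about a thousand dollars, which is what makes in-memory "fast data" practical.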
  • #2: Big
  • Success on the Data Stack
    Big Analytics
    [chart: analytics tools plotted by speed (batch to real-time) against scale (megabytes to petabytes), marked as free, open-source or commercial: Revolution R, R, SAP, SAS, SciPy, SPSS; other labels: custom (hardware), custom, distributed]
  • The Promise of Analytics
    extract
    learn
    predict
    DATA
    FEATURES
    MODELS
    “More data usually beats better algorithms.”
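
The extract, learn, predict flow on this slide (DATA to FEATURES to MODELS) can be sketched in a few lines. This is only an illustration of the three stages, not the speaker's method: the input rows are made up, the model is a plain least-squares line, and all function names are assumptions. It does not attempt to demonstrate the quoted claim about data versus algorithms.

```python
def extract(raw_rows):
    """DATA -> FEATURES: parse 'x,y' text rows into numeric pairs."""
    return [tuple(float(v) for v in row.split(",")) for row in raw_rows]

def learn(features):
    """FEATURES -> MODEL: least-squares fit of y = slope * x + intercept."""
    n = len(features)
    mean_x = sum(x for x, _ in features) / n
    mean_y = sum(y for _, y in features) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in features)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(model, x):
    """MODEL -> prediction for a new input x."""
    slope, intercept = model
    return slope * x + intercept

raw = ["1,3", "2,5", "3,7", "4,9"]  # made-up rows lying on y = 2x + 1
model = learn(extract(raw))
print(predict(model, 10.0))
```

Keeping the stages as separate functions mirrors the slide: each stage can be swapped out (richer features, a different model) without touching the others.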
  • #3: Focused
  • Success on the Data Stack
    Focused Services
    Focused
    Services
    Analytics
    Data
  • “Real-time, large-scale analytics in a focused vertical.”
    credit: Joe Reisinger, Metamarkets
  • Success on the Data Stack
    Fast, Big, and Focused
    Focused
    Services
    Big
    Analytics
    Fast
    Data
  • Thank You. Questions?
    Michael E. Driscoll, CTO, Metamarkets
    @medriscoll
    O’Reilly Strata Online | May 25, 2011