Why now?<br />
There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every 2 days, and the pace is increasing.<br />
Eric Schmidt, Google CEO, Techonomy Conference, August 4, 2010<br />
Data is becoming the new raw material of business: an economic input almost on a par with capital and labour. “Every day I wake up and ask, ‘how can I flow data better, manage data better, analyse data better?’” says Rollin Ford, the CIO of Wal-Mart.<br />
Source: Data, Data Everywhere, The Economist, February 25, 2010<br />
We’re going to talk about the BUSINESS of big data. We’ll briefly touch upon some of the key technology trends enabling Big Data businesses, but this discussion is really focused on three things:
Note: this is not meant to be the definitive *only* way to think about and conceptualize Big Data, but it is how *we* think about it.
Also, we’ve prepared some material, but we’d love for this to be an interactive discussion and not a lecture.
Organizations everywhere now realize that there is immense insight and value locked inside their data, and that new infrastructure and approaches to data analysis allow us to unlock that value.
This is what’s happened in the last four decades.
These four factors also happen to be inputs for data generation processes.
Sizes that were unimaginable a few years ago are now commonplace
Just storing and accessing the data can be difficult
SIZE :: MANAGED WITH :: STORED
Small :: Excel, R :: fits in memory on one machine
Medium :: indexed files, monolithic DB :: fits on disk on one machine
Big :: Hadoop, Distributed DB :: stored across many machines
Generally: data too big to fit on a disk :: ‘data-center’ scale
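The size tiers above can be sketched as a simple rule of thumb. This is only an illustration: the function name `storage_tier` and the example machine (16 GB of RAM, 2 TB of disk) are assumptions made for the sketch, not figures from the talk.

```python
GB = 10**9
TB = 10**12


def storage_tier(data_bytes, ram_bytes, disk_bytes):
    """Classify a dataset as 'small', 'medium', or 'big'.

    small  -> fits in memory on one machine (Excel, R)
    medium -> fits on disk on one machine (indexed files, monolithic DB)
    big    -> must be stored across many machines (Hadoop, distributed DB)
    """
    if data_bytes <= ram_bytes:
        return "small"
    if data_bytes <= disk_bytes:
        return "medium"
    return "big"


# Hypothetical single machine: 16 GB of RAM and a 2 TB disk.
for size in (2 * GB, 500 * GB, 50 * TB):
    print(size, storage_tier(size, ram_bytes=16 * GB, disk_bytes=2 * TB))
```

The point of the sketch is that the tier depends on the machine as much as on the data: the same dataset moves from "big" to "medium" as disks grow.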
Data that is difficult for computers to understand
Principal example being natural language
Text, images, video and more
Valuable info locked up inside this data (e.g. Twitter)
More data coming in faster
Decision windows getting smaller
Valuable to worthless in a matter of minutes (seconds … no, milliseconds)
IA Ventures is funding companies that are powering this Big Data revolution
Tools & Technologies for managing Big Data
Two primary axes to think about when categorizing Big Data companies
1. Whose data is it?
Proprietary data you generate yourself?
or
Doing something interesting with someone else’s data?
2. What do you do with the data?
Create a data product?
Or use the data to inform product decisions for some non-data product?
We’ll make this slightly more granular. Within data products there are companies that
a) simply sell the data (almost like a direct pass-through)
or
b) do some sort of computation on the data first and sell the abstracted data output
Now that we’ve talked about what a Big Data company is, we can begin to discuss what makes a Big Data company massively successful.
The key to almost any massively successful company is creating some sort of sustainable competitive advantage / barrier to entry for competitors.
When competitive advantages exist, it is *very* difficult for new entrants to provide the same product or service that you provide. This means that your company will enjoy a unique position in the market for years to come. By contrast, companies that do not have barriers to entry are susceptible to competitive threats from new entrants that simply out-execute them.
There are others, but I don’t view them as strong:
IP: an exogenous dynamic imposed via legal structure, not something intrinsic to the product or market dynamic
Switching costs: may be a barrier for existing customers, but has little impact on new customers
Network effect = a system where the value to the incremental customer is a direct function of the customers already in the system
Network effects are so powerful because they introduce a dynamic that tips towards a winner-take-all market
Examples: Microsoft, eBay, Skype, Facebook
Data is put into some analytic framework, analyzed and returned. There is no sharing of insight, and nothing makes the system smarter.
Data is put into some shared framework. Every node in the system benefits by being in the same system. The system gets smarter.
Some amount of insight is shared across nodes.
Ex. Metamarkets: contributory data model
Ex. Billguard: leverages flags across the network
Average unit cost declines as quantity produced increases
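A quick numeric sketch of that statement. It assumes the simplest possible cost model, a one-time fixed cost plus a constant variable cost per unit; that model and the dollar figures are illustrative assumptions, not numbers from the talk.

```python
def average_unit_cost(fixed_cost, variable_cost_per_unit, quantity):
    """Average cost per unit under an assumed fixed-plus-variable cost model."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    # The fixed cost is spread over every unit produced, so its share
    # per unit shrinks as quantity grows.
    return fixed_cost / quantity + variable_cost_per_unit


# Hypothetical figures: $1,000 fixed cost, $2 variable cost per unit.
for quantity in (10, 100, 1000):
    print(quantity, average_unit_cost(1000, 2, quantity))
```

As quantity rises the average cost falls toward the variable cost per unit, which is why the larger producer can undercut a smaller one.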
Best way to explain is via example (-> go to next slide)
The key (as with economies of scale and network effects) is that it starts with out-execution; then the competitive advantage dynamic takes hold and creates a virtuous spiral
(Go back to previous slide to explain virtuous spiral)