Building Competitive Moats With Data

Co-Founder and CEO at SkipFlag
Oct. 1, 2014

  1. Building Competitive Moats With Data Pete Skomoroch @peteskomoroch DataLead Oct 1, 2014 - Berkeley
  2. About Me • Ex Principal Data Scientist @ LinkedIn • Entrepreneur, Advisor at Data Collective
  3. Competitive Moats
  4. Data as Competitive Moat
  5. Why the current obsession with Big Data?
  6. The rise of Hadoop
  7. What is Big Data?
  8. Big Data: Myths
  9. Big Data: Reality • Science, theory, and reason are not being replaced • Big Data is different: for some problems, big data produces better results than we find with smaller samples • Data storage and logging are increasingly cheap, so err on the side of collecting data to process later if you think it may be valuable • Large, differentiated data assets are the foundation for defensible products and better decisions
  10. If software is eating the world…
  11. … it is replacing it with data
  12. Startups are moving offline life to online data • Restaurants => Yelp • Resume + Rolodex => LinkedIn • PowerPoint => SlideShare • Yearbook + Photos => Facebook • Real Estate => Redfin • Interior Design => Houzz
  13. The Data Factory Revolution Source: 2013 Steve Jennings/Getty Images Entertainment
  14. Early Data Factory:
  15. User-Generated Data Moats
  16. User-entered data has gravity
  17. Behavioral history is a moat: life is easier when apps remember you
  18. Reputation-Based Data Moats
  19. Network-Based Data Moats
  20. Don’t build on top of someone else’s moat
  21. Real scientists make their own data
  22. Build distinct, defensible datasets
  23. This sounds great, how do I build a data moat?
  24. A new occupation: data scientist
  25. What do data scientists actually do? source: data from
  26. Two species of data scientist*
      Type I: Traditional BI
      • Question-driven
      • Interactive
      • Ad-hoc, post-hoc
      • Fixed data
      • Focus on speed and flexibility
      • Output is embedded into a report, dashboard, or in-database scoring engine
      Type II: Data Products
      • Metric-driven
      • Automated
      • Systematic
      • Fluid data
      • Focus on transparency and reliability
      • Output is a production system that makes customer-facing decisions
      *Slide adapted from Josh Wills, “From the Lab to the Factory”
  27. Data Products: automated systems that make customer-facing decisions and collect data
  28. Data Product pre-history: Data Aggregators • 1972: Vinod Gupta forms American Business Information, Inc., a database initially built via manual data entry of Yellow Pages information • 1973: LEXIS full-text legal search launches publicly • 1986: Bloomberg reaches 5,000 terminal subscribers • 1994: Jerry Yang & David Filo compile and maintain a hand-curated set of categorized links on the World Wide Web known as the Yahoo! Directory
  29. The Rise of Algorithmic Data Products • Google: Web Search, PageRank, AdWords • Netflix: Movie Recommendations • Pandora: Music Recommendations • eBay: Product Search, Fraud Detection, Advertising • Amazon: Similar Items, Book Recommendations • LinkedIn: People You May Know, Who Viewed My Profile
  30. LinkedIn Skills: a moat built by data products
  31. Data Product investment and ROI • Skill Extraction and Standardization Pipeline • Skill Pages • Skills Section on member profiles • Suggested Skills Algorithm and email > 20M members • Skill Endorsements > 60M members, 3B+ Edges • Big product wins: engagement, recall, relevance • SkillRank & Reputation Algorithm R&D • LinkedIn is now the definitive source for information on skills & expertise *Statistics as of 2013
  32. How leaders can drive data growth • Accountability: Who defines the data vision & roadmap in your organization? Who is accountable for building and expanding your moat? • Invest in data infrastructure, training, logging, & tools for rapid iteration. Build a data lake. • Invest in exploration and innovation, including user-facing data product and algorithm development • Define a framework for trading off data quality and quantity metrics • Ask “How does this increase our data moat?” when evaluating any new project, and reward projects that do
  33. Twitter: @peteskomoroch LinkedIn:

Editor's Notes

  1. Scientists make measurements: creating new information, observations, alpha • Some data scientists go to great lengths to avoid collecting data or touching the user interface, when a small change can eliminate tons of wasted time • Requires authority or support from leadership to make product changes • Works best if data scientists are involved in design decisions from the start • CERN supercollider: collect something nobody else has collected
  2. Vision/Roadmap: what data doesn’t exist that would make your product better, aligned with company mission. Google Street View Photos => Self-Driving Car
  3. Facebook / LinkedIn story – emergence of new role
  4. “Built to Last”: be a clock builder (an architect), not a time teller • Another analogy: are you a sports reporter, repeating the details of the game in a dashboard, or are you crunching that data to select the best new talent?
  6. Consumer Internet: productization of data + algorithms (eBay, Google, Amazon, Netflix, Pandora) • Google's index size is a barrier now