Big data, big deal? February 2013. Matt Turck. Twitter: @mattturck. Blog: http://mattturck.com
Background: I prepared this slide deck for a couple of “Big Data 101” guest lectures I did in February 2013 at New York Univ...
What does Target know about pregnant women?
Hype. Data is… “the new gold”, “the new black”, “the new plastic”, “the new oil”, “the new frontier”
Isn’t it what computers have always done?
What’s different this time? Volume. Variety. Velocity.
Facebook warehouses 180 petabytes of data a year
Twitter manages 1.2 million deliveries per second
New sources of data
Twitter manages 1.2 million deliveries per second
Open Government Data
Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, ...
A new breed of technologies
Big Data Landscape: Infrastructure, Analytics ...
A new breed of people: Data scientists. engineering math nerds ...
Sexy nerds? “Data Scientist: The Sexiest Job of the 21st Century”, October 2012
Nerd talent shortage
Terms worth remembering: Structured vs. unstructured data, Hadoop, Cloud computing, Data visualization ...
So what do you do with all that technology?
Lending
Trading
Insurance
Agriculture
Healthcare
Energy
Music
Education
But what about small data?
Moneyball is (relatively) small data
Nate Silver is (relatively) small data
Most companies only have small data
It’s not about big data for the sake of big data
Data-driven management: “In God we trust. Everyone else, bring data”
Data-driven culture
Easier than ever for any business to be truly data-driven
Thanks! Learn more: NYC Data Business Meetup, meetup.com/NYC-Data-Business-Meetup/
Big Data, Big Deal? (A Big Data 101 presentation)
Background: I prepared this slide deck for a couple of “Big Data 101” guest lectures I did in February 2013 at New York University’s Stern School of Business and at The New School. They’re intended for a college-level, non-technical audience, as a first exposure to Big Data and related concepts. I have re-used a number of stats, graphics, cartoons and other materials freely available on the internet. Thanks to the authors of those materials.

Published in: Technology
  • This is going to be a talk for people who love the internet.
  • The true story of bitly, engineering, data science, love. How to do data science at scale. Building teams and keeping people happy. Clever tricks.
  • Very different perspective: we have constrained resources, short time, and an expectation that what we do is relevant to the real world in some way. We build the system on this data, and then scale it for production use.
  • Asking questions.
  • Big Data, Big Deal? (A Big Data 101 presentation)

    1. Big data, big deal? February 2013. Matt Turck. Twitter: @mattturck. Blog: http://mattturck.com
    2. Background: I prepared this slide deck for a couple of “Big Data 101” guest lectures I did in February 2013 at New York University’s Stern School of Business and at The New School. They’re intended for a college-level, non-technical audience, as a first exposure to Big Data and related concepts. I have re-used a number of stats, graphics, cartoons and other materials freely available on the internet. Thanks to the authors of those materials.
    3. What does Target know about pregnant women?
    4. Hype. Data is… “the new gold”, “the new black”, “the new plastic”, “the new oil”, “the new frontier”
    5. Isn’t it what computers have always done?
    6. What’s different this time? Volume. Variety. Velocity.
    7. Facebook warehouses 180 petabytes of data a year
    8. Twitter manages 1.2 million deliveries per second
    9. New sources of data
    10. Twitter manages 1.2 million deliveries per second
    11. Open Government Data
    12. “Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it.” Edd Dumbill, O’Reilly
    13. A new breed of technologies
    14. Big Data Landscape (chart). Sections: Infrastructure, Analytics, Applications, Data Sources, Cross Infrastructure / Analytics, Open Source Projects. Categories shown include NoSQL Databases, NewSQL Databases, MPP Databases, Hadoop Related, Data Visualization, Sentiment Analysis, Analytics Services, Big Data Search, IT Analytics, SMB Analytics, Ad Optimization and Personal Data. Credit: Matt Turck (@mattturck) and Shivon Zilis (@shivonz)
    15. A new breed of people: data scientists. (Diagram: engineering nerds, math nerds, comp sci nerds and hacking nerds combine into “awesome nerds”.) Credit: Hilary Mason, Bitly
    16. Sexy nerds? “Data Scientist: The Sexiest Job of the 21st Century”, October 2012
    17. Nerd talent shortage
    18. Terms worth remembering: Structured vs. unstructured data, Hadoop, Cloud computing, Data visualization, Machine learning, Predictive analytics
    19. So what do you do with all that technology?
    20. Lending
    21. Trading
    22. Insurance
    23. Agriculture
    24. Healthcare
    25. Energy
    26. Music
    27. Education
    28. But what about small data?
    29. Moneyball is (relatively) small data
    30. Nate Silver is (relatively) small data
    31. Most companies only have small data
    32. It’s not about big data for the sake of big data
    33. Data-driven management: “In God we trust. Everyone else, bring data”
    34. Data-driven culture
    35. Easier than ever for any business to be truly data-driven
    36. Thanks! Learn more: NYC Data Business Meetup, meetup.com/NYC-Data-Business-Meetup/
