Generating Big Value from Big Data


Talking about Big Data generates a lot of questions; however, most of the focus is on the technologies and skills required to collect and store this volume of information as opposed to the insight that companies need to derive from it. What factors should organizations consider in order to ensure that they are capitalizing on their investments with these technologies? How do you break through business silos to enable sharing of data to increase organizational value? Leveraging his cross-industry experience at companies like The Walt Disney Company, Travelers Insurance and Demand Media, Brendan Aldrich will discuss the question of “big value” with industry examples and a particular focus on his current work to deploy a “data democracy” within the City Colleges of Chicago.

Session Discovery Topics:
• Big value - keeping an eye on the forest (assumptions, judgment and bias)
• Data democracy - increasing productivity with data transparency and open access

Published in: Technology, Education
  • Pygmalion: Written in 1912 by George Bernard Shaw. London premiere in 1914 at Sir Herbert Beerbohm Tree’s His Majesty’s Theatre. Adapted as “My Fair Lady” in 1956.
  • 120 million people in the U.S. now own smartphones. 72 hours of video are added to YouTube every minute. 230 million tweets per day. 30+ billion pieces of data added to Facebook every month. 9,000 job search results for data scientists in 2012.
  • Reasons cited for Education: lack of data-driven mindset and available data. Cluster A: Computer and electronic products and information sectors. Cluster B: Finance and Insurance and Government. Cluster C: Construction, Arts, Education & Other (negative productivity growth indicating strong systemic barriers to increasing productivity). Cluster D: Manufacturing, Transportation, Wholesale and Professional services. Cluster E: Local services.
  • “A data democracy is not about making ‘all data available for everyone’. It’s about making sure that each person has access to the data, metadata, analytic tools and reports they need to best fulfill their own role and responsibilities.”

    1. 1. Generating Big Value from Big Data • Chicago Big Data Executive Summit, June 12, 2013
    2. 2. Using Data to Derive Value • Lessons Learned: – Data size is relative to an organization’s ability to make use of it – Assumptions and bias can get in the way – The best insights are actionable
    3. 3. Speaker Introduction: R. Brendan Aldrich, Executive Director, Data Warehousing, City Colleges of Chicago • 18 years in Information Technology • 13 years running data warehouse, business intelligence and analytics teams for global high-volume data companies such as The Walt Disney Company, Travelers Insurance and Demand Media • Currently building a data democracy at the City Colleges of Chicago • TDWI and AERA membership
    4. 4. The City Colleges of Chicago is the largest community college district in the state of Illinois and one of the largest in the country, with more than 5,800 administrators, staff and faculty educating over 120,000 students annually at facilities located within the city of Chicago. • Colleges: – Richard J. Daley College – Kennedy-King College – Malcolm X College – Olive-Harvey College – Harry S Truman College – Harold Washington College – Wilbur Wright College • Satellites: – Lakeview Learning Center – Dawson Technical Institute – West Side Learning Center – South Chicago Learning Center – Arturo Velasquez Institute – Humboldt Park Vocational Education Center • Culinary: – The French Pastry School – Washburn Culinary Institute • Parrot Cage Restaurant • Sikia Banquet Room • Broadcast: – WYCC TV (Channel 20) – WKKC FM 89.9 …as well as five child development centers, the Center for Distance Learning and the Workforce Institute
    5. 5. The Origin of Big Data: John Mashey, chief scientist at Silicon Graphics until 2000, gave hundreds of talks to small groups in the mid-to-late 1990s using the term “Big Data” to describe how the boundaries of computing keep advancing. 1
    6. 6. Gartner Group 2001: Doug Laney first uses “Volume, Velocity & Variety” to describe Big Data 2 2012: Gartner updates the definition to: “Big data are high volume, high velocity and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process automation”
    7. 7. Datafication is Driving Big Data Datafication: Creating new data that didn’t previously exist in digital form The more you know about your customer, the better you can differentiate yourself from your competitors.
    8. 8. Disney’s MagicBands 3 • Customer Value: – Disney’s MagicBands will allow park guests to access the park, sign up for ride waitlists (FastPass), interact with characters, purchase items, find lost parents, etc. • Company Value: – What type of guest are you and how do you route through the park (rides, concessions, shows, purchases, etc.) – Route optimization, scheduling, ride balancing – Know your customer • Worldwide – 121.4 million guests (2011) • Florida – 17.1 million guests (2011)
    9. 9. Getting to Big Value (or… Don’t Miss the Trees for the Forest) 1. Gathering vs. Understanding 2. Assumptions 3. Bias
    10. 10. Barrier #1: Gathering vs. Understanding “Big Data is not defined by its data management challenges, but by the organization’s capabilities in analyzing the data, deriving intelligence from it, and leveraging it to make forward looking decisions.” 4 - Isaac Sacolick, VP Technology at McGraw-Hill Construction
    11. 11. Yes… The Volumes are Big
    12. 12. The “Understanding” Market Takes Off
    13. 13. Value Derived from Human Interaction “Data and data sets are not objective; they are creations of human design. We give numbers their voice, draw inferences from them, and define their meaning through our interpretations.” 5 - Kate Crawford, Principal Researcher @ Microsoft Research
    14. 14. What Does Your Data Weigh? Data classification based on the value being derived from the data: • Light Data – Easily quantifiable measures and facts • Mid-Weight Data – Interesting data; trends; patterns • Heavy Data – Rich, meaningful, verified, and actionable data
    15. 15. Barrier #2: Assumptions People inherently make assumptions… which can lead you to find what you expect as opposed to the marketable anomalies
    16. 16. Netflix • DVD rental and video streaming company with over 33 million subscribers (29 million streaming) in 40 countries • Big Data Stats: – More than 50 Cassandra clusters with over 750 nodes – More than 50,000 reads & 100,000 writes per second • Claims 75% of its subscribers are influenced by what it suggests they will like. 6
    17. 17. House of Cards • Netflix’s data indicated that the same subscribers who loved the original BBC production of “House of Cards” also loved movies starring Kevin Spacey or directed by David Fincher. 7 • Netflix has committed $100 million to create two 13-episode seasons.
    18. 18. Were they Right? • From a data standpoint, it’s hard to know since Netflix doesn’t release viewership numbers. • But how else could we evaluate? – Facebook likes: 206k – Twitter: 34,706 Followers – Mainstream Culture • Magazine Covers? • Talk shows? • What do you hear? • What could we conclude?
    19. 19. Barrier #3: BIAS “Hidden biases in both the collection and analysis stages present considerable risks, and are as important to the big-data equation as the numbers themselves.” 5 - Kate Crawford, Principal Researcher @ Microsoft Research
    20. 20. Classification of Bias 8 • Cognitive – Misunderstanding of the probabilities. • Selection – Most available, convenient and/or cost-effective as opposed to most relevant. • Sampling – Most relevant to a subset that may not hold true in the wider population. • Modeling – Biased assumptions drive selection of wrong variables. • Funding – Assumptions, interpretations, data and applications skewed to favor funding party. • Representation – Larger data sets do not ensure that the data is representative.
    21. 21. Accounting for Bias 9 • Know your Enemy – Be aware of biases that may affect your analysis. Document them as part of your results. • Make use of Subject Matter Experts – Validate your results with domain experts and use them to test your findings and algorithms. • Continuous Exploration – Don’t settle for satisfactory! Investigate the anomalies and explore the data outside of your focus.
    22. 22. Generating Big Value • Big Data is quantitative • Deriving meaningful insights requires people • Managing assumptions and bias increases value • Insights identified can be acted upon • Insights acted upon must be continually reviewed Anything Else?
    23. 23. Rise of the Data Democracy “Humans are not an important part of utilizing new data, they are the single most important part of the process.” 10 - Bryce Maddock, CEO of
    24. 24. McKinsey: Systemic Barriers for Education 11
    25. 25. Building a Data Democracy: Enable Everyone with Access • The right data must be available in all areas of the organization. • Access to and use of data will create positive and lasting change. • All City Colleges of Chicago employees will be able to use this platform to obtain data and/or run reports. Only part of this challenge is licensing cost! Organizational acceptance, tool selection, bandwidth, data comprehension and accessible training are critical!
    26. 26. Building a Data Democracy: Breaking Down the Silos
    27. 27. Building a Data Democracy: One-Size Does Not Fit All • Reports … User-Created Dashboards … and Interactive Analytics for all users • A unified data warehouse and web-based interface for accessing and interacting with data
    28. 28. Building a Data Democracy: Increase Data Comprehension & Skills Integrated Data Dictionary and Online Training By integrating necessary reference and training information directly into the analytics website, we enable our employees to know with certainty what their data means and how to use it effectively.
    29. 29. Takeaways • Generating Big Value from Big Data: – Datafication is driving differentiation in the marketplace • Collect the data that drives your business – The value in Big Data is derived from human insight • How much does your data weigh? – Be aware of Assumptions and Bias in your approach • Evaluate what does and doesn’t benefit your analysis – Enable everyone with the right data to succeed • Data democracy
    30. 30. APPENDIX
    31. 31. References
        • Infographics
          – IBM Big Data Hub, Infographic, “Tuning Into Big Data As The Buzz Gets Louder”, 9/26/12, louder
          – Mushroom Networks, Infographic, “Landscape of Big Data”, 2013,
          – Graeme Noseworthy, Infographic, “The Flood of Big Data”, 4/24/12, digital-marketing/
          – 4 Isaac Sacolick, Blog, “What is Big Data The Real Challenges Beyond Volume, Velocity and Variety”, 12/11/12, beyond.html
          – 7 Mary McNamara, Los Angeles Times, “Netflix’s ‘House of Cards’ looks, but doesn’t sound, like a hit”, 4/27/13, st-house-of-cards-netflix-20130427
          – 6 Andrew Leonard, Salon, “How Netflix is turning viewers into puppets”, 2/1/13, _puppets/
        • Articles
          – 5 Kate Crawford, Blog, Harvard Business Review, “The Hidden Biases in Big Data”, 4/1/13,
          – 8 James Kobielus, IBM Big Data Hub, “Data Scientist: Bias, Backlash and Brutal Self-Criticism”, 5/16/13, brutal-self-criticism
          – 9 Haowen Chan and Robin Morris, GigaOm, “Careful: Your big data analytics may be polluted by data scientist bias”, 5/4/13, analytics-may-be-polluted-by-data-scientist-bias/
          – 11 James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh and Angela Hung Byers, McKinsey Global Institute, “Big data: The next frontier for innovation, competition and productivity”, 5/11, e_next_frontier_for_innovation
          – 3 Jules Polonetsky, LinkedIn Post, “Magic Lessons for Retailers”, 5/31/13, magic-lessons-for-retailers
          – 10 Bryce Maddock, Blog, “People and Big Data: Separately Good, Together Great”, 9/26/12,
          – 1 Steve Lohr, The New York Times, “The Origins of ‘Big Data’: An Etymological Detective Story”, detective-story/
          – 2 Doug Laney, Blog, “Deja VVVu: Others Claiming Gartner’s Construct for Big Data”, 1/14/12, volume-velocity-variety-construct-for-big-data/