EDF2013: Invited Talk Daragh O'Brien: The Story of Maturity – How data in Business needs to pass the ‘So What’ tests

Invited talk of Daragh O'Brien, Managing Director of Castlebridge Associates, at the European Data Forum 2013, 9 April 2013 in Dublin, Ireland: The Story of Maturity – How data in Business needs to pass the ‘So What’ tests

  • The history of all great hype cycles
  • Tom gives the example of his early work in telecoms billing data. The emphasis was on the quality of the sample (sample bias), but the actual measurement errors in the process, the data quality issues, were an order of magnitude greater than the errors due to sample bias.
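
The point in the note above can be sketched with a few made-up numbers. This is only an illustration, not Tom Redman's actual billing data: the amounts, the skewed sample and the single mis-keyed record are all assumptions chosen to make the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical "true" monthly bill amounts for 20 accounts (illustrative only).
true_amounts = [float(v) for v in range(40, 60)]

# A skewed sample: only the first 10 accounts were drawn (sample bias).
sample_ids = list(range(10))

# What the billing process recorded for the sampled accounts: account 5 was
# keyed as 4500.0 instead of 45.0 (a measurement / data quality error).
recorded = {i: true_amounts[i] for i in sample_ids}
recorded[5] = 4500.0

population_mean = mean(true_amounts)
sample_true_mean = mean(true_amounts[i] for i in sample_ids)
sample_recorded_mean = mean(recorded[i] for i in sample_ids)

# Error introduced by how the sample was drawn.
sampling_error = abs(sample_true_mean - population_mean)
# Error introduced by what was recorded for that same sample.
measurement_error = abs(sample_recorded_mean - sample_true_mean)

print(f"error from sample bias:  {sampling_error}")
print(f"error from measurement:  {measurement_error}")
```

In this toy case a single bad record moves the average far more than the skewed sample does, which is the shape of the problem the note describes.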

Transcript

  • 1. EUROPEAN DATA FORUM From Near to Maturity – Making Big Data relevant to Business © 2013 Castlebridge Associates
  • 2. HISTORY Or: How we came to have all this data anyway…
  • 3. Ancient Sumeria • Written in Accadian • Used pictographic representations of information and concepts baked/carved into tablets made of clay (high sand content)
  • 4. Filing: The Birth of Big Data Image by Nic McPhee @ commons.wikimedia.com
  • 5. 6 thousand years, Tablets to Tablets: Physical Data (5925 years approx.), Electronic Data (c. 75 years) • More information processed • Information processed faster • More ‘self service’ data processing • Changed expectations of data and processing.
  • 6. But the BIG QUESTION is: SO WHAT??
  • 7. Particularly as we may be too late! • Barry Devlin: “Big Data is Dead. It’s all just Data!!” (B-EyeNetwork, December 2012) • Samuel Arbesman (Wired.com): “Stop Hyping Big Data and Start Paying Attention to ‘Long Data’” (Wired.com – January 2013) • Ted Friedman (Gartner) on Twitter: Image © Barry Devlin/B-EYENetwork
  • 8. Is Big Data just a matter of perspective?
  • 9. MATURITY
  • 10. Where is Big Data? Certainty / Optimising • Wisdom / Managed • Enlightenment / Defined • Awakening / Repeatable • Uncertainty / Initial (Overlaying Crosby CMM model with DMBOK Maturity model)
  • 11. Where is Big Data? Certainty / Optimising • Wisdom / Managed • Enlightenment / Defined • Awakening / Repeatable • Uncertainty / Initial
  • 12. Maturity: Answering So What Questions • So What… • …is it? • …problems will it solve? • …will we be able to do differently? • …legal / regulatory risks does all this pose? • …do we need to do to tap this gold mine? • …are we not doing today that this will enable? • …are we not doing today that this will make worse?
  • 13. THE CHALLENGES
  • 14. Organisations don’t manage data well • Information Governance / Data Governance only now emerging as formal disciplines • Information Quality / Data Quality also only beginning to be coherently tackled in many organisations • Phone companies still get bills wrong • Data Protection breaches still occur (note: this is more than just SECURITY breaches) • Data Migrations, CRM, ERP still fail • Metadata largely under-managed
  • 15. Bottom Line Impact • Deloitte: 88% of Risk Managers see Information as “Significant” in their Risk Management plans • Bloor: 84% of Data Migrations FAIL (don’t deliver, overrun time/budget, deliver reduced functionality) • Forrester: 75% of Chief Financial Officers see Information Management as a barrier to achieving Business goals • Gartner: an estimated 35% of TURNOVER is wasted by companies due to poor information quality • IBM: 30% of time is lost to organisations from staff rechecking information • This is when dealing with “traditional” structured/semi-structured data.
  • 16. Strategy Goals/Objectives/Issues/Opportunities (Why) Culture & Environment
  • 17. “So far, for 50 years, the information revolution has centered on data—their collection, storage, transmission, analysis, and presentation. It has centered on the "T" in IT. The next information revolution asks, what is the MEANING of information, and what is its PURPOSE?” Peter Drucker, Forbes ASAP, August 1998
  • 18. After the Hype Comes the Hangover
  • 19. Data Is the New Oil [Images: Oil Slick, Water. Pic: US Coast Guard; Picture from NASA]
  • 20. A REAL EXAMPLE Names have been changed to protect the innocent (and the guilty)
  • 21. The Pending Order Crisis of 2006 If order not completed, cannot be billed
  • 22. The Pending Order Crisis of 2006 • OMG There’s MILLIONS of unbilled revenue out there. • This is a CRISIS!!!
  • 23. The Pending Order Crisis of 2006 The Sky is FALLING
  • 24. The Pending Orders Solution 2006 Elite Specialist Information Quality Agent Licensed to “Fix the Data by all means necessary” (firearms not actually used…)
  • 25. The Pending Orders Solution 2006 • Orders for infrastructure had engineering statuses • Orders could have multiple dependent products – double counted • Revenue Assurance did not look at all relevant data sources • Dependencies between process steps not understood
  • 26. The Pending Order Solution 2006 • There wasn’t a Crisis situation • External Factors affected order completion times • Intra-order product dependencies lead to double counting • Context of the process was important • Revenue Assurance Hypothesis was flawed
  • 27. ASKING THE RIGHT QUESTIONS
  • 28. One way of thinking about data
  • 29. Question 1: So What Data Do We Need? “No doubt that more data helps, but don’t for a minute think that you need all data to make an informed business decision. Organizations that are effectively leveraging the power of Big Data realize that they will never capture all relevant information.” Phil Simon, Too Big to Ignore: The Business Case for Big Data
  • 30. Question 1: So What Data Do We Need? Chicken Little © 2005 Disney Corporation
  • 31. Question 1: So What Data Do We Need? • What is the problem we are trying to solve? • What is the Process Context for this problem? • What is the “Information Environment” for this problem?
  • 32. The Pending Orders Crisis • What is the problem we are trying to solve? Customers are not being billed for services they have; revenue from services is not being realised; we have orders that are not being completed • What is the Process Context for this problem? • What is the “Information Environment” for this problem?
  • 33. Question 1: So What Data Do We Need? To properly answer this question you need to have: A PLAN
  • 34. Question 2: So What is Stopping us doing it? • Regulation: Data Protection Rules; Industry Regulations re: Data Governance • Technology: Legacy architecture; Technology Management (Silos) • Human Factors: Skills (technical/problem solving/analytical); Political (Change Management)
  • 35. Question 2: So What is Stopping us doing it? Data: • Quality of internal data • Completeness, consistency, “transactability” • Ability to link external data to internal data • Governance of data • Decision rights • Supplier relationship management • Roles & Responsibilities
  • 36. Example of Regulation: Location Data • Use of Location Data in Telecommunications is affected by EU Data Protection rules • Consent is required for it to be used for “Value Adding” services
  • 37. Data Quality “I am incredibly sceptical about claims that “Big Data” is immune to Data Quality problems. Statistically, Data Quality errors will skew your mean, and create outliers that affect your analysis. While “Big Data” might not be as prone to ‘fat finger’ errors, you still have to consider whether the mechanisms gathering the data are correctly calibrated and the algorithms for analysis are running correctly or whether you have measurement errors you don’t know about.” Dr Thomas C Redman, thought leader in Data Quality
  • 38. Data Quality & Lineage are Key
  • 39. Databases are like lakes • System A • System B • System C
  • 40. Bias within the Data? “The greatest number of tweets about Sandy came from Manhattan. This makes sense given the city’s high level of smartphone ownership and Twitter use, but it creates the illusion that Manhattan was the hub of the disaster. Very few messages originated from more severely affected locations, such as Breezy Point, Coney Island and Rockaway. As extended power blackouts drained batteries and limited cellular access, even fewer tweets came from the worst hit areas.” Kate Crawford, Hidden Biases in Big Data, HBR, 1st April 2013
  • 41. Human Factors • Bias • Politics • Skills • “Attachment Disorder” • Change & Transition Management
  • 42. Strategy Goals/Objectives/Issues/Opportunities (Why) Culture & Environment
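
Returning to the Pending Orders example (slides 25 and 26), the double counting caused by intra-order product dependencies can be sketched in a few lines. Everything here is hypothetical: the order IDs, products and values are invented, and the sketch assumes a report that repeats the order-level value on every dependent product line, which may not match how the real report was built.

```python
# Hypothetical pending order lines: (order_id, product, order_value).
# Assumption: the order-level value is repeated on each product line.
pending_lines = [
    ("A1", "line rental", 30.0),
    ("A1", "broadband", 30.0),   # depends on the A1 line rental
    ("A1", "voicemail", 30.0),   # depends on the A1 line rental
    ("B2", "line rental", 25.0),
    ("C3", "line rental", 40.0),
    ("C3", "broadband", 40.0),   # depends on the C3 line rental
]

# Naive view: add up every pending line, so each order is counted once
# per dependent product.
naive_unbilled = sum(value for _, _, value in pending_lines)

# Order-aware view: keep one value per distinct order before summing.
value_by_order = {}
for order_id, _, value in pending_lines:
    value_by_order[order_id] = value
deduped_unbilled = sum(value_by_order.values())

print(f"naive 'unbilled revenue':       {naive_unbilled}")
print(f"one value per order (deduped):  {deduped_unbilled}")
```

The gap between the two totals is an artefact of the report, not revenue that is actually missing, which is why the process context matters before declaring a crisis.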