The Bigger They Are The Harder They Fall

Transcript

  • 1. Be Certain. Be Trillium Certain.
    The Bigger They Are The Harder They Fall: Big Data & the Data Quality Imperative
    Nigel Turner, VP Strategic Information Management
    Tuesday 19th June 2012
  • 2. The bigger they are the harder they fall…
  • 3. But big can pay off…
  • 4. Big Data – what is it?
    • Set of new concepts, practices & technologies to manage & exploit digital data
    • OVUM defines it as: “A data computational problem that is large and varied enough to demand new approaches to traditional SQL & related practices”
    • Key premise is that all data has potential value if it can be collected, analysed and used to generate actionable insight
  • 5. Big Data – its characteristics
    The 3Vs
    • Reflects exponential growth of data – predicted 40-60% per annum
    • Today 2.5 quintillion bytes of data are created every day
    • 90% of all digital data was created in the last two years
    • Data generated more varied and complex than before:
        – Text, Audio, Images, Machine Generated etc.
    • Much of this data is semi-structured or unstructured
    • Traditional IT techniques ill equipped to process & analyse it
    • Data often generated in real time
    • Analysis and response needs to be rapid, often also real time
    • Traditional BI / DW environments becoming obsolescent – new approaches are needed
  • 6. What’s different about Big Data?
    • New technologies which enable distributed & highly scalable MPP (Massively Parallel Processing), e.g.
        – Apache Hadoop
        – MapReduce
        – NoSQL databases
    • Strong emphasis on analytical approaches
        – Emergence of “data science”
        – Predictive Analytics
        – Data Mining
    • The “democratisation” of data
        – Data made available to all (cf Cloud Computing)
        – Business and not IT led BI
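As a rough illustration of the MapReduce model named on slide 6, the sketch below counts words in plain Python on a single machine. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are assumptions made for this example, and a real framework would spread the map and reduce steps across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Illustrative input only; not taken from the presentation.
counts = word_count(["big data big problems", "data quality"])
```

Because each record is mapped independently and each key is reduced independently, both phases can in principle run in parallel, which is what makes the model a fit for the MPP approach the slide describes.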
  • 7. Where does Big Data come from? Widely known sources
  • 8. Where does Big Data come from? Social Media & Social Networks
  • 9. Where does Big Data come from? Machine Generated data
  • 10. Big Data – some vertical applications
    • Retail: using point of sale & social media data to supplement & enrich traditional CRM / Marketing data
    • Insurance & Banking: fraud detection
    • Health: holistic patient analysis
    • Utilities: consumption peaks & troughs & capacity planning
    • Telcos: call routing optimisation & customer churn
    • Manufacturing: predictive fault identification & supply chain optimisation
    • Research: particle analysis, genomics etc.
  • 11. Big Data in practice – Volvo
    • Every Volvo vehicle has hundreds of microprocessors / sensors
    • Data generated used within the car itself but also captured for analysis by Volvo and its dealers
    • All data is loaded into a centralised data analysis hub & integrated with CRM, dealership & product data
    • Used to optimise design & manufacturing, enhance customer interaction & improve safety
  • 12. Big data in practice – fraud detection
  • 13. Big Data – why invest?
    • Better understanding of customer & market behaviour
    • Improved knowledge of product & service performance
    • Aids innovation in products & services
    • Fact based and more rapid decision making
    • Enhances revenue
    • Reduces costs
    • Stimulates economic growth
  • 14. Big Data – the impact on individuals
    • Employees
        – Empower & devolve decision making
        – Create new job & upskilling opportunities
    • Consumers
        – Better targeted offers
        – Improved products & services that meet needs
  • 15. Big Data – the privacy concern
  • 16. Big Data – Foundations of Success
    • Identifying the right data to solve the business problem or opportunity
    • The ability to integrate & match varied data from multiple data sources
        – structured, semi-structured, unstructured
    • Building the right IT infrastructure to support Big Data applications
    • Having the right capabilities & skills to exploit the data
  • 17. Big Data – the data integration challenge
    [Diagram labels] External data sources: social media, sensors, CS data, email, mobiles. Internal data sources: CRM, billing, ops, sales, prods. Analytics platforms 1, 2, 3 ... n. Actionable insight & knowledge.
  • 18. Big Data – Barriers & Pitfalls
    • The sheer volume of data – what’s worth using?
    • Data extraction challenges
    • The ability to match data from disparate sources / formats / media
    • The time taken to integrate new data sources
    • The risks of mismatching and incorrect identification of individuals
    • Legal & regulatory pitfalls
    • Security concerns – corporate & individual
    • Lack of skills & expertise
    • Making the case for investment
  • 19. Big Data – the Data Quality Imperative (1)
    • Need to profile external and internal data sources
    • Need to classify data to define what data really matters
    • Need to assure the quality of internal (and some external) data sources for accuracy, completeness, consistency
    • Need to define & apply business rules & metadata management to how the data will be defined and used
    • Need for a data governance framework to ensure consistency & control
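To make the profiling step on slide 19 concrete, here is a minimal sketch that measures completeness and counts distinct values per field. The `profile` function, the field names and the sample records are all hypothetical and do not represent any Trillium Software API.

```python
def profile(records, fields):
    """Return per-field completeness (share of non-empty values) and distinct-value counts."""
    total = len(records)
    report = {}
    for field in fields:
        values = [r.get(field) for r in records]
        # Treat None and the empty string as missing.
        filled = [v for v in values if v not in (None, "")]
        report[field] = {
            "completeness": len(filled) / total if total else 0.0,
            "distinct": len(set(filled)),
        }
    return report


# Illustrative records only; not taken from the presentation.
sample = [
    {"name": "Ann Lee", "country": "UK"},
    {"name": "", "country": "United Kingdom"},
    {"name": "Bob Roy", "country": "UK"},
    {"name": "Cy Dunn", "country": None},
]
report = profile(sample, ["name", "country"])
```

In this sample the `country` field holds two distinct spellings for what is probably one value, the kind of inconsistency that profiling is meant to surface before the data reaches an analytics platform.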
  • 20. Big Data – the Data Quality Imperative (2)
    Need processes & tools to enable:
    • Source data profiling
    • Data integration
    • Data parsing
    • Data standardisation
    • Business rule creation & management
    • Metadata management & a shared business / IT glossary
    • Data de-duplication
    • Data normalisation
    • Data matching
    • Data enrichment
    • Data audit
    Many of these functions must be capable of being carried out in real time with zero lag
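Three of the functions listed on slide 20 (standardisation, matching and de-duplication) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the approach of any particular data quality product: the abbreviation table, the 0.9 similarity threshold and the function names are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical standardisation rules; a real rule set would be far larger.
ABBREVIATIONS = {"st": "street", "rd": "road", "ltd": "limited"}


def standardise(text):
    """Lower-case, replace punctuation with spaces, and expand known abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def is_match(a, b, threshold=0.9):
    """Fuzzy-match two standardised strings using difflib's similarity ratio."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def deduplicate(values, threshold=0.9):
    """Keep the first standardised form of each group of matching values."""
    kept = []
    for value in values:
        std = standardise(value)
        if not any(is_match(std, existing, threshold) for existing in kept):
            kept.append(std)
    return kept


# Illustrative addresses only; not taken from the presentation.
unique = deduplicate(["12 High St.", "12 high street", "40 Mill Rd"])
```

Standardising before matching is the design choice that matters here: the first two addresses only become identical once "St." has been expanded. The pairwise comparison in `deduplicate` is quadratic, so it would need blocking or indexing to approach the real-time requirement the slide mentions.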
  • 21. Big Data – the key enabler
    [Diagram labels] External data sources. Internal data sources. Data quality platform: profile, parse, standardise, match, enrich. Analytics platforms 1, 2, 3 ... n. Actionable insight & knowledge.
  • 22. Big Data – some algorithms
    1. BIG DATA + POOR DATA QUALITY = BIG PROBLEMS
    2. DATA DEMOCRATISATION – DATA GOVERNANCE = ANARCHY
    3. DATA MASH UPS – DATA QUALITY = DATA MESS
    4. BIG DATA ANALYTICS + POOR DQ = WRONG RESULTS
    5. BIG DATA – DATA ASSURANCE = JAIL
    6. 3V + DATA QUALITY = 4V (VALIDITY)
  • 23. Big Data – the future
    • To date Big Data has been overhyped but now a tipping point has come
    • It is here and will grow in volume, velocity & variety
    • Immature concept & market so hard to plan – but consolidation is happening
    • Big data in a business context reflects emerging generation’s expectations & needs
    • Data will increasingly be seen as an asset
    • Data skills will become increasingly valued
  • 24. Big Data – how Trillium Software can help
    Current Trillium Software products & services can help you succeed in your Big Data journey:
    • Real time & batch data capabilities in:
        o Data profiling
        o Parsing
        o Standardisation
        o De-duplication
        o Matching
        o Enrichment
        o Audit
    • Strategic consulting services to prepare for and realise Big Data opportunities
  • 25. Questions
    Contact: nigel.turner@trilliumsoftware.com
    www.trilliumsoftware.com