OPTION 2
BIG DATA
John McKeever,
Data Migrators
Martin Spratt,
Clear Strategic IT
Wednesday, 20 August 14
Big Data
Personalisation, Prediction and Prevention
John McKeever
Housekeeping
Please... keep your phones ON and go to:
pollev.com/jmck
Feel free to put them on ‘silent’ though :-)
From any browser
Agenda
• What is Big Data?
‣ Where did it come from? Why now?
• What opportunities does it present?
‣ Personalisation, Prediction, Prevention
• How do I get started?
• NOTE: This is an opinion piece (It’s not a science!)
What is Big Data?
Poll
What is Big Data?
Answer
“Big Data is any thing which is crash Excel”
DevOps Borat
Poll
Worldwide, how many people
have access to a mobile device?
Why Big Data?
• Of the world’s 7 billion people, 6 billion have access to a mobile device
• Last year we...
‣ Sent 11 billion texts
‣ Watched 2.8 billion YouTube videos
‣ Performed 5 billion Google searches
• The world’s data doubles every 2.1 years
Where is it coming from?
• Increased device accessibility
• New storage paradigms
• New transaction types
• Growth in social media
• Increase in use of rich media
• More conversations
The Internet Of Things
• Internet-enabled everything
• Objects predict their own failure
‣ ... and wirelessly notify their manufacturers
‣ ... who automatically pre-order parts
• Objects upgrade themselves
‣ ... such as this Mac
• Objects communicate with one another
‣ Energy companies will control demand
Wearables
• “The Quantified Self”
• Smartphone proxies
• Location-aware devices
• Fitness trackers
• Augmented reality
• Cloud publishing
Why Invent ‘Big Data’?
• Big Data is about more than just ‘lots of data’
‣ ... although that’s part of it
• Big Data is typically characterised by the ‘3Vs’:
Volume
Variety
Velocity
Volume
• Typically measured in Petabytes
‣ A gigabyte is 7 minutes of HD video
‣ A terabyte is 120 hours of HD video (1,024 GB)
‣ A petabyte is 14 years of HD video (1,024 TB)
• Accelerating rate of growth - driven largely by mobile devices
• Prices dropping dramatically
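The HD video figures above can be checked with a little arithmetic. The sketch below assumes, as the slide does, that one gigabyte holds about 7 minutes of HD video and that each unit is 1,024 times the previous one; the constant and function names are illustrative only.

```python
# Assumption taken from the slide: 1 gigabyte holds roughly 7 minutes of HD video.
MINUTES_PER_GB = 7
GB_PER_TB = 1024
TB_PER_PB = 1024


def hd_video_minutes(gigabytes: float) -> float:
    """Return the minutes of HD video that fit in the given number of gigabytes."""
    return gigabytes * MINUTES_PER_GB


# One terabyte expressed in hours of video.
tb_hours = hd_video_minutes(GB_PER_TB) / 60

# One petabyte expressed in years of continuous video.
pb_years = hd_video_minutes(GB_PER_TB * TB_PER_PB) / 60 / 24 / 365

print(f"1 TB is about {tb_hours:.0f} hours of HD video")
print(f"1 PB is about {pb_years:.1f} years of HD video")
```

Under those assumptions the results land just under the rounded figures quoted on the slide.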
Storage is Cheap
• Storage costs are reducing exponentially
• Data expands to fill the space available
• Heading fast towards the online ‘Personal Petabyte’
[Chart: Storage Costs ($US/MB), 1980 to 2014, log scale from 0.00001 to 10,000. Source: PC magazine, Byte magazine, newegg.com]
Variety
• Traditional databases are designed for well-structured data
• Making sense of free-form text?
• Extracting information from audio?
• Searching video?
• New relationship structures between data
‣ Increasing use of network modelling
Network Modelling
• Sentiment is viral!
• Uncovers relationships of varying types and strengths
• What are the distinct groups within your customer networks?
• Who are the most and least connected?
• Who are the ‘influencing nodes’ in your customer networks?
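The questions above map onto simple graph operations. This is a minimal sketch, assuming a hypothetical list of "who talks to whom" links between customers; it treats connected components as the "distinct groups" and uses the number of direct connections as a rough stand-in for influence, which is only one of several possible measures.

```python
from collections import defaultdict, deque

# Hypothetical customer-to-customer links (illustrative data only).
edges = [
    ("ann", "bob"), ("ann", "cat"), ("ann", "dan"),
    ("bob", "cat"),
    ("eve", "fay"),
]


def build_graph(edge_list):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    graph = defaultdict(set)
    for a, b in edge_list:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_groups(graph):
    """Return the distinct groups (connected components) via breadth-first search."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        groups.append(group)
    return groups


graph = build_graph(edges)
# Degree = number of direct connections per customer.
degree = {node: len(neighbours) for node, neighbours in graph.items()}
most_connected = max(degree, key=degree.get)
groups = connected_groups(graph)

print("Most connected:", most_connected)
print("Groups:", groups)
```

On real data a graph library and a richer centrality measure would probably be a better fit; the point here is only that each question on the slide has a concrete computation behind it.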
Why Look At Networks?
• Unhappy customers vent frustrations on social networks
• Those using Twitter are already disproportionately upset
‣ Compared to those raising traditional complaints
• Twitter complaint response: 3 minutes ➔ 70 minutes
• Email complaints: 24 hours (30%) ➔ Never (70%)
‣ Almost ¾ of organisations are ignoring their customers!
Viral Complaints
• Dave Carroll and his band were flying with United Airlines, whose handlers damaged his guitars
• Complaints were met with rudeness, avoidance and red tape
• His YouTube response video received 13 million hits
• Negative sentiment flooded social networks
• United Airlines’ stock dropped 10% ($180 million)
Velocity
• Driven by proliferation of mobile devices
• Twitter processes over 34,000 tweets every 60 seconds
• Amazon processes approximately 20 million transactions a day
• The SKA (Square Kilometre Array), due for completion in 2024, will generate...
‣ 1,376 petabytes per day
‣ Twice the current daily global internet traffic!
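To compare the velocity figures above on one scale, they can be converted to per-second rates. A rough sketch, assuming the slide's numbers as given and 1 PB = 1,024 TB (matching the Volume slide); the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Twitter: 34,000 tweets every 60 seconds.
tweets_per_second = 34_000 / 60

# Amazon: approximately 20 million transactions a day.
amazon_tx_per_second = 20_000_000 / SECONDS_PER_DAY

# SKA: 1,376 petabytes per day, expressed in terabytes per second.
# Assumes 1 PB = 1,024 TB.
ska_tb_per_second = 1_376 * 1_024 / SECONDS_PER_DAY

print(f"Twitter: about {tweets_per_second:.0f} tweets per second")
print(f"Amazon:  about {amazon_tx_per_second:.0f} transactions per second")
print(f"SKA:     about {ska_tb_per_second:.1f} TB per second")
```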
What’s Big Data?
• ‘Traditional’ data processing technology isn’t designed for Big Data
‣ Ask Facebook, Google, Twitter, eBay, Amazon, Walmart, ...
• Big Data could be thought of as an organisational toolkit:
‣ Application of new technologies to handle the 3Vs
‣ Application of advanced statistical tools to our data
‣ Adaptation of business processes to leverage new insight
What Opportunities Does
Big Data present?
Poll
Q: Which one of these guys won the 2008 U.S. Presidential race?
Eric, Barack, Dan
Poll
Eric Schmidt (Google)
Barack Obama (President)
Dan Wagner (Geek)
Predicting Politics
• Nate Silver
• Big Data Scientist who started by predicting baseball results
• Famous for predicting the 2008 US election results with 98% accuracy (49 of 50 states)
• Did it again in 2012 with 100% accuracy (gave Obama a 91% chance of winning)
Predicting Crime
• “PredPol” predictive policing initiative
• Los Angeles Police Department and the University of California
• Software predicts where crime will occur within a given area
• Based on analysis of 13 million crimes over the last 80 years
Predicting Crime
• Mathematical model originally determined the location of earthquake aftershocks
• Crime prediction model is updated with new crime data in real time to improve accuracy
• Result: 12% decrease in property crime, 28% decrease in burglary
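The slide does not say how the model works. One common way to model aftershocks is a "self-exciting" process, where each event temporarily raises the expected rate of further events. The sketch below illustrates that general idea in time only; it is not PredPol's actual algorithm, and the parameters `mu`, `alpha`, `beta` and the event times are hypothetical.

```python
import math


def intensity(t, past_events, mu=0.5, alpha=0.8, beta=1.2):
    """Expected event rate at time t.

    mu    : constant background rate (hypothetical value)
    alpha : size of the boost each past event adds (hypothetical value)
    beta  : how quickly that boost decays (hypothetical value)
    """
    boost = sum(alpha * math.exp(-beta * (t - s)) for s in past_events if s < t)
    return mu + boost


# Hypothetical event times, in days.
burglaries = [1.0, 1.5, 4.0]

quiet = intensity(0.5, burglaries)          # before any event: background only
after_cluster = intensity(1.6, burglaries)  # just after two events close together
later = intensity(3.9, burglaries)          # boost has largely decayed

print(quiet, after_cluster, later)
```

Feeding new events into `past_events` as they are reported is a simple analogue of the real-time updating mentioned above.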
NSW Police 3rd Eye Cameras
• Sydney police getting vest-mounted cameras
• The Wolfcom units record
‣ 6 hours of HD video
‣ 20,000 12-megapixel images
‣ 500 hrs voice recording
‣ All GPS tagged
• How is this used?
Target
• Minneapolis father furious at ‘offensive’ marketing to his daughter
• ... due to Andrew Pole, Big Data specialist at Target
• Andrew identified about 25 products that, together, allowed him to assign each online user a ‘pregnancy prediction’ score
• ... which can also estimate the due date to within a few days!
• Target uses this to send coupons timed to very specific stages of pregnancy
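A prediction score of this kind can be thought of as a weighted combination of purchase signals. The sketch below is an assumption about the general shape of such a model, not Target's actual method: the product names, weights and bias are hypothetical, and a logistic function is used simply to squash the total into a 0 to 1 score.

```python
import math

# Hypothetical product weights. The slide says about 25 products were used
# but does not list them, so these three are placeholders.
WEIGHTS = {
    "unscented lotion": 1.4,
    "vitamin supplements": 1.1,
    "cotton balls": 0.6,
}
BIAS = -3.0  # hypothetical baseline: most shoppers score low


def prediction_score(basket):
    """Return a score between 0 and 1 from the products present in a basket."""
    z = BIAS + sum(weight for product, weight in WEIGHTS.items() if product in basket)
    return 1 / (1 + math.exp(-z))


empty = prediction_score(set())
partial = prediction_score({"cotton balls"})
full = prediction_score(set(WEIGHTS))

print(f"{empty:.3f} {partial:.3f} {full:.3f}")
```

No single product moves the score much; it is the combination that matters, which is the point the slide makes with "together".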
Big Data Approaches
Holistic data
Unstructured data
Correlation
Holistic Data
• Traditional approaches used data sampling due to data volumes
‣ Take every nth record
‣ Take selected records (e.g. geographical or other segments)
• Sampling is often biased
‣ Statistical aberrations
‣ Simpson’s paradox
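Simpson's paradox is easiest to see with numbers: a trend that holds in every segment can reverse when the segments are pooled. The counts below are illustrative only (patterned on a well-known textbook example) and the names are hypothetical.

```python
# Illustrative counts: (successes, trials) per segment for two options.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def segment_rate(option, segment):
    """Success rate for one option within one segment."""
    successes, trials = results[option][segment]
    return successes / trials


def overall_rate(option):
    """Success rate for one option with all segments pooled."""
    successes = sum(s for s, _ in results[option].values())
    trials = sum(n for _, n in results[option].values())
    return successes / trials


for segment in ("small", "large"):
    print(segment, round(segment_rate("A", segment), 3), round(segment_rate("B", segment), 3))
print("overall", round(overall_rate("A"), 3), round(overall_rate("B"), 3))
```

Option A wins in both segments yet loses overall, because the two options were tried on very different mixes of segments. A sample that over-represents one segment can therefore point to the wrong conclusion.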
Unstructured Data
• Incorporate unstructured data into your analysis
‣ Twitter, Facebook, Social Media
‣ Emails, Contact notes
‣ Audio, Pictures, Videos
‣ Networks
• Distill these and use them to feed analytic models
Correlation over Causation
• Traditional analysis involves testing hypotheses against our data
‣ Requiring a hypothesis
‣ Root cause analysis based on guessing reasons for behaviour
• Holistic data opens the door to a new approach
‣ Focus on influencing factors rather than possible causes
‣ Root cause analysis based on statistical probability
Correlation =/= Causation
[Chart: divorces per thousand, 2000 to 2008, plotted against an unnamed second series (“?”)]
Poll
The yellow line is the divorce rate in Maine, US.
What’s the correlating blue value?
Correlation =/= Causation
[Chart: divorce rate (divorces per thousand) plotted against margarine sales (pounds), 2000 to 2008]
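How closely two series move together can be measured with Pearson's correlation coefficient. The sketch below uses hypothetical values that were not read from the chart; it shows that two unrelated downward trends can produce a coefficient close to 1 without either one causing the other.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical yearly values for 2000 to 2008, NOT taken from the chart.
divorce_rate = [5.0, 4.7, 4.6, 4.4, 4.3, 4.1, 4.2, 4.2, 4.2]
margarine_lb = [8.2, 7.0, 6.5, 5.3, 5.2, 4.0, 4.6, 4.5, 4.2]

r = pearson(divorce_rate, margarine_lb)
print(f"r = {r:.2f}")
```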
Big Data for Complaints
• Anticipate complaints
‣ Based on statistical probability and our customer insight
• Identify the root cause of complaints
‣ Link complaints to business processes and organisational change
• Proactively engage customers in high quality conversations
‣ So poor conversations don’t escalate into complaints
The Opportunities
• Derive insight from customer behaviour
• Analytic probabilities rather than traditional signals
‣ Correlation over causation
• Augment our data with 3rd party intelligence
• Derive insight from non-traditional (unstructured) sources
How Do I Get Started
With Big Data?
Statistically Probable Starting Point
• Big Data is not a panacea!
• Big Data will not fix your data quality issues
‣ Customer insight requires a single view of the customer!
• Start by assessing your current information architecture
• Data Integration
Disparate Data
CRM
Complaints
Financial
Provisioning
Data Warehouse
• Disconnected Data
• Repeated Data
• Data Quality
• Data stewardship
• Business Glossary
• How do I answer questions?
Integrated Data
CRM
Complaints
Financial
Provisioning
Warehouse
3rd Party
Unstructured
Whatever...
Getting Started
• Focus on...
‣ Data Integration
‣ Data Quality
‣ Master Data Management
• Big Data can help these initiatives
‣ ...but you need to reach a minimum threshold before you start
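As a rough illustration of what data integration means in practice, the sketch below pulls records for the same customer out of separate systems into one combined view. The systems, fields and records are hypothetical, and matching on a normalised email address is a deliberately simple assumption; real master data management usually needs fuzzier matching and stewardship rules.

```python
from collections import defaultdict

# Hypothetical extracts from three disconnected systems.
crm = [{"email": "Jo.Smith@Example.com ", "name": "Jo Smith"}]
complaints = [{"email": "jo.smith@example.com", "issue": "late delivery"}]
financial = [
    {"email": "jo.smith@example.com", "balance": 42.0},
    {"email": "lee@example.com", "balance": 0.0},
]


def match_key(record):
    """Normalise the email so that the same customer matches across systems."""
    return record["email"].strip().lower()


def integrate(**systems):
    """Group every record from every system under one key per customer."""
    customers = defaultdict(dict)
    for system_name, records in systems.items():
        for record in records:
            customers[match_key(record)].setdefault(system_name, []).append(record)
    return dict(customers)


single_view = integrate(crm=crm, complaints=complaints, financial=financial)

for customer, sources in single_view.items():
    print(customer, "appears in:", sorted(sources))
```

Without the normalisation step the CRM record and the complaint would look like two different customers, which is the data quality problem the slide warns about.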
Big Customer Understanding
[Diagram labels: Customer Enrichment, Integration, Customer Analytics, Conversations, Feedback]
What can Big Data do for me?
• You don’t need an SKA or 1.23 billion users (like Facebook) to benefit from the approaches adopted by the Big Data organisations
• The Big Data toolkit incorporates...
‣ Technology adoption
‣ Statistical modelling
‣ Business change
What can Big Data do for me?
• Understand your customers
‣ Customer segmentation to a segment of one - The Customer
• Anticipate their needs
‣ ... and hence their behaviour
• Drive high quality conversations with them
‣ Based on your understanding of them
Big Data Technologies
• Business intelligence
• Visualisation
• Infrastructure
• Agile methodologies
• New data storage architectures
• Parallel processing
• Machine learning
• Statistical modelling
Big Data Summary
Characteristics
• Volume
• Variety
• Velocity
Approaches
• Incorporating unstructured data into your analysis
• Holistic data rather than sampling
• Correlation rather than causation
Opportunities
• Complaints root cause analysis
• Improve the quality of conversations
• Anticipate behaviour through deep understanding
Privacy - Social Media
• Legislation will always trail technology
• Social Media sites most frequently have “a worldwide, non-exclusive, royalty-free license, with the right to sublicense”
• You’ve never paid Facebook or Twitter a cent
• They can ‘monetize’ both your content and your metadata
• Most legislation centres around self-policing and opt-out
Privacy - Individuals’ Rights
• Staples (US) operate a punitive pricing model
‣ Your IP address tells them if you live in an expensive neighbourhood
• OfficeMax addressed marketing to “Mike Seay, Daughter Killed in Car Crash”
• Washington D.C. police officer convicted after looking up licence plates of vehicles near a gay bar and blackmailing the vehicles’ owners
Privacy - I can buy your...
• Full name, spouse, children, ex-partners, co-habitees, current address, previous addresses, ownership status, purchase date and price, outstanding mortgage debt
• Job type, income band, credit score, credit and store cards, spending habits, charitable contributions, family events (births, deaths) and likely political affiliation
• Ethnicity, primary language, and (in the US) health information!
‣ Cancer, diabetes and clinical depression lists with credit score
Connect With Me
au.linkedin.com/in/jhmckeever
@datamigrators
john.mckeever@datamigrators.com
Questions...
http://pollev.com/jmck
58Wednesday, 20 August 14

More Related Content

Similar to Personalization, Prediction and Prevention with Big Data

Big Data and Intellectual Property
Big Data and Intellectual PropertyBig Data and Intellectual Property
Big Data and Intellectual PropertyJoren De Wachter
 
Big Data — Your new best friend
Big Data — Your new best friendBig Data — Your new best friend
Big Data — Your new best friendReuven Lerner
 
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria? Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria? INACAP
 
Big data v4.0
Big data v4.0Big data v4.0
Big data v4.0Ian Brown
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementationSandip Tipayle Patil
 
Big Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceBig Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceDavid Feinleib
 
FBIC Global Deborah Weinswig New Tech Presentation Dec. 3 2014
FBIC Global Deborah Weinswig New Tech Presentation Dec. 3 2014FBIC Global Deborah Weinswig New Tech Presentation Dec. 3 2014
FBIC Global Deborah Weinswig New Tech Presentation Dec. 3 2014Deborah Weinswig
 
DataEd Online: Demystifying Big Data
DataEd Online: Demystifying Big DataDataEd Online: Demystifying Big Data
DataEd Online: Demystifying Big DataDATAVERSITY
 
Data-Ed: Demystifying Big Data
Data-Ed: Demystifying Big DataData-Ed: Demystifying Big Data
Data-Ed: Demystifying Big DataData Blueprint
 
Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...InnoTech
 
Big data 2017 final
Big data 2017   finalBig data 2017   final
Big data 2017 finalAmjid Ali
 
Big Data & the importance of Data Science
Big Data & the importance of Data ScienceBig Data & the importance of Data Science
Big Data & the importance of Data ScienceWim Van Leuven
 
Big Data By Vijay Bhaskar Semwal
Big Data By Vijay Bhaskar SemwalBig Data By Vijay Bhaskar Semwal
Big Data By Vijay Bhaskar SemwalIIIT Allahabad
 
Social Media World presentation
Social Media World presentationSocial Media World presentation
Social Media World presentationkperi
 
Introduction to Big Data and Data Science
Introduction to Big Data and Data ScienceIntroduction to Big Data and Data Science
Introduction to Big Data and Data ScienceFeyzi R. Bagirov
 

Similar to Personalization, Prediction and Prevention with Big Data (20)

Big Data
Big DataBig Data
Big Data
 
Data Mining With Big Data
Data Mining With Big DataData Mining With Big Data
Data Mining With Big Data
 
Big Data and Intellectual Property
Big Data and Intellectual PropertyBig Data and Intellectual Property
Big Data and Intellectual Property
 
Big data
Big dataBig data
Big data
 
Big Data — Your new best friend
Big Data — Your new best friendBig Data — Your new best friend
Big Data — Your new best friend
 
What Developers Should Do With Data
What Developers Should Do With DataWhat Developers Should Do With Data
What Developers Should Do With Data
 
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria? Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
 
Big data v4.0
Big data v4.0Big data v4.0
Big data v4.0
 
Data mining with big data implementation
Data mining with big data implementationData mining with big data implementation
Data mining with big data implementation
 
Big Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 ConferenceBig Data Trends - WorldFuture 2015 Conference
Big Data Trends - WorldFuture 2015 Conference
 
Big Data and You
Big Data and YouBig Data and You
Big Data and You
 
FBIC Global Deborah Weinswig New Tech Presentation Dec. 3 2014
FBIC Global Deborah Weinswig New Tech Presentation Dec. 3 2014FBIC Global Deborah Weinswig New Tech Presentation Dec. 3 2014
FBIC Global Deborah Weinswig New Tech Presentation Dec. 3 2014
 
DataEd Online: Demystifying Big Data
DataEd Online: Demystifying Big DataDataEd Online: Demystifying Big Data
DataEd Online: Demystifying Big Data
 
Data-Ed: Demystifying Big Data
Data-Ed: Demystifying Big DataData-Ed: Demystifying Big Data
Data-Ed: Demystifying Big Data
 
Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...
 
Big data 2017 final
Big data 2017   finalBig data 2017   final
Big data 2017 final
 
Big Data & the importance of Data Science
Big Data & the importance of Data ScienceBig Data & the importance of Data Science
Big Data & the importance of Data Science
 
Big Data By Vijay Bhaskar Semwal
Big Data By Vijay Bhaskar SemwalBig Data By Vijay Bhaskar Semwal
Big Data By Vijay Bhaskar Semwal
 
Social Media World presentation
Social Media World presentationSocial Media World presentation
Social Media World presentation
 
Introduction to Big Data and Data Science
Introduction to Big Data and Data ScienceIntroduction to Big Data and Data Science
Introduction to Big Data and Data Science
 

Recently uploaded

SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 

Recently uploaded (20)

SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentation
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 

Personalization, Prediction and Prevention with Big Data

  • 1. OPTION 2 BIG  DATA John McKeever, Data Migrators Martin Spratt, Clear Strategic IT 1Wednesday, 20 August 14
  • 2. Big Data Personalisation, Prediction and Prevention John McKeever 2Wednesday, 20 August 14
  • 3. Housekeeping Please... keep your phones ON and go to: pollev.com/jmck Feel free to put them on ‘silent’ though :-) pollev.com/jmck From  any  browser 3Wednesday, 20 August 14
  • 4. Agenda • What is Big Data? ‣ Where did it come from? Why now? • What opportunities does it present? ‣ Personalisation, Prediction, Prevention • How do I get started? • NOTE: This is an opinion piece (It’s not a science!) 4Wednesday, 20 August 14
  • 5. What is Big Data? 5Wednesday, 20 August 14
  • 6. Poll What is Big Data? 6Wednesday, 20 August 14
  • 7. Answer “Big Data is any thing which is crash Excel” DevOps Borat 7Wednesday, 20 August 14
  • 8. Poll Worldwide, how many people have access to a mobile device? 8Wednesday, 20 August 14
  • 9. Why Big Data? • 7 billion people access 6 billion mobile devices • Last year we... ‣ Sent 11 billion texts ‣ Watched 2.8 billionYouTube videos ‣ Performed 5 billion Google searches • The world’s data doubles every 2.1 years 9Wednesday, 20 August 14
  • 10. Where is it coming from? • Increased device accessibility • New storage paradigms • New transaction types • Growth in social media • Increase in use of rich media • More conversations 10Wednesday, 20 August 14
  • 12. The Internet Of Things • Internet-enabled everything • Objects predict their own failure ‣ ... and wirelessly notify their manufacturers ‣ ... who automatically pre-order parts • Objects upgrade themselves ‣ ... such as this Mac • Objects communicate with one another ‣ Energy companies will control demand 12Wednesday, 20 August 14
  • 13. Wearables • “The Quantified Self” • Smartphone proxies • Location-aware devices • Fitness trackers • Augmented reality • Cloud publishing 13Wednesday, 20 August 14
  • 14. Why Invent ‘Big Data’? • Big Data is about more than just ‘lots of data’ ‣ ... although that’s part of it • Big Data typically characterised by ‘3Vs’: Volume Variety Velocity 14Wednesday, 20 August 14
  • 15. Volume • Typically measured in Petabytes ‣ A gigabyte is 7 minutes of HD video ‣ A terabyte is 120 hours of HD video (1024 Gb) ‣ A petabyte is 14 years of HD video (1024 Tb) • Accelerating rate of growth - driven largely by mobile devices • Prices dropping dramatically 15Wednesday, 20 August 14
  • 16. Storage is Cheap • Storage costs are reducing exponentially • Data expands to fill the space available • Heading fast towards the online ‘Personal Petabyte’ 0.00001 0.00010 0.00100 0.01000 0.10000 1.00000 10.00000 100.00000 1,000.00000 10,000.00000 1980 1989 1997 2006 2014 Source: PC magazine, Byte magazine, newegg.com Storage Costs ($US/Mb) 16Wednesday, 20 August 14
  • 17. Variety • Traditional databases are designed for well-structured data • Making sense of free-form text? • Extracting information from audio? • Searching video? • New relationship structures between data ‣ Increasing use of network modelling 17Wednesday, 20 August 14
  • 18. Network Modelling • Sentiment is viral! • Uncovers relationships of varying types and strengths • What are the distinct groups within your customer networks? • Who are the most and least connected? • Who are the ‘influencing nodes‘ in your customer networks? 18Wednesday, 20 August 14
  • 20. Why Look At Networks? • Unhappy customers vent frustrations on social networks • Those using Twitter are already disproportionately upset ‣ Compared to those raising traditional complaints • Twitter complaint response: 3 minutes ➔ 70 minutes • Email complains: 24 hours (30%) ➔ Never (70%) ‣ Almost ¾ of organisations are ignoring their customers! 20Wednesday, 20 August 14
  • 21. Viral Complaints • The Dave Carroll band were flying with United Airlines whose handlers damaged his guitars • Complaints were met with rudeness, avoidance and red tape • HisYouTube response video received 13 million hits • Negative sentiment flooded social networks • United Airlines’ stock dropped 10% ($180 million) 21Wednesday, 20 August 14
  • 22. Velocity • Driven by proliferation of mobile devices • Twitter processes over 34,000 tweets every 60 seconds • Amazon process approximately 20 million transactions a day • The SKA, due for completion in 2024, will generate... ‣ 1,376 petabytes per day ‣ Twice the current daily global internet traffic! 22Wednesday, 20 August 14
  • 23. What’s Big Data? • ‘Traditional’ data processing technology isn’t designed for Big Data ‣ Ask Facebook, Google,Twitter, eBay, Amazon,Walmart, ... • Big Data could be thought of as an organisational toolkit: ‣ Application of new technologies to handle the 3V’s ‣ Application of advanced statistical tools to our data ‣ Adaptation of business processes to leverage new insight 23Wednesday, 20 August 14
  • 24. What Opportunities Does Big Data Present?
  • 25. Poll • Q: Which one of these guys won the 2008 U.S. Presidential race? ‣ Eric ‣ Barack ‣ Dan
  • 27. Predicting Politics • Nate Silver • Big Data Scientist who started by predicting baseball results • Famous for predicting 2008 US election results with 98% accuracy • Did it again in 2012 with 100% accuracy (gave Obama a 91% chance of winning)
  • 28. Predicting Crime • “PredPol” predictive policing initiative • Los Angeles Police Department and the University of California • Software predicts where crime will occur within a given area • Based on analysis of 13 million crimes over the last 80 years
  • 29. Predicting Crime • The mathematical model was originally used to determine the location of earthquake aftershocks • Crime prediction model is updated with new crime data in real time to improve accuracy • Result: 12% decrease in property crime, 28% decrease in burglary
  • 30. NSW Police 3rd Eye Cameras • Sydney police getting vest-mounted cameras • The Wolfcom units record: ‣ 6 hours of HD video ‣ 20K 12-megapixel images ‣ 500 hours of voice recording ‣ All GPS-tagged • How is this used?
  • 31. Target • Minneapolis father furious at ‘offensive’ marketing to his daughter • ... due to Andrew Pole, Big Data specialist at Target • Andrew identified about 25 products that, together, allowed him to assign each online user a ‘pregnancy prediction’ score • ... which can also estimate the due date to within a few days! • Target uses this to send coupons timed to very specific stages of pregnancy
  • 32. Big Data Approaches • Holistic data • Unstructured data • Correlation
  • 33. Holistic Data • Traditional approaches used data sampling due to data volumes ‣ Take every nth record ‣ Take selected records (e.g. geographical or other segments) • Sampling is often biased ‣ Statistical aberrations ‣ Simpson’s paradox
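Simpson's paradox, mentioned on the Holistic Data slide, is easier to see with numbers. The sketch below uses purely illustrative figures (not data from the presentation) for two hypothetical complaint-handling processes, "A" and "B", split by complaint type. Process A has the better resolution rate within each segment, yet B looks better once the segments are pooled, because the two processes handle very different mixes of simple and complex complaints.

```python
# (resolved, total) per process and complaint type: illustrative numbers only.
results = {
    "A": {"simple": (81, 87), "complex": (192, 263)},
    "B": {"simple": (234, 270), "complex": (55, 80)},
}

def rate(resolved, total):
    """Resolution rate for one (resolved, total) pair."""
    return resolved / total

def overall(process):
    """Resolution rate with all segments pooled together."""
    resolved = sum(r for r, _ in results[process].values())
    total = sum(t for _, t in results[process].values())
    return rate(resolved, total)

# A is better within every segment...
a_wins_each_segment = all(
    rate(*results["A"][seg]) > rate(*results["B"][seg])
    for seg in ("simple", "complex")
)
# ...but B is better when the segments are combined.
b_wins_overall = overall("B") > overall("A")

print(a_wins_each_segment, b_wins_overall)
```

A sample drawn mostly from one segment would reach the opposite conclusion to one drawn from the pooled data, which is the risk the slide attributes to sampling.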
  • 34. Unstructured Data • Incorporate unstructured data into your analysis ‣ Twitter, Facebook, Social Media ‣ Emails, Contact notes ‣ Audio, Pictures, Videos ‣ Networks • Distill these and use them to feed analytic models
  • 35. Correlation over Causation • Traditional analysis involves testing hypotheses against our data ‣ Requiring a hypothesis ‣ Root cause analysis based on guessing reasons for behaviour • Holistic data opens the door to a new approach ‣ Focus on influencing factors rather than possible causes ‣ Root cause analysis based on statistical probability
  • 36. Correlation =/= Causation • [Chart: divorces per thousand, 2000 to 2008, plotted against a second, unnamed series marked “?”]
  • 37. Poll The yellow line is the divorce rate in Maine, US. What’s the correlating blue value?
  • 38. Correlation =/= Causation • [Chart: divorces per thousand, 2000 to 2008, plotted against a second, unnamed series marked “?”]
  • 39. Correlation =/= Causation • [Chart: divorce rate (divorces per thousand) plotted against margarine sales (pounds), 2000 to 2008]
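The point of the divorce and margarine chart can be reproduced with a hand-rolled Pearson correlation. The two series below are illustrative values chosen to trend downwards together; they are not read from the chart, and the function name `pearson` is simply a label used in this sketch. A coefficient close to 1 says the series move together and nothing about whether one causes the other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Illustrative values only (one per year, 2000 to 2008), NOT chart data.
divorces_per_thousand = [5.0, 4.7, 4.6, 4.4, 4.3, 4.1, 4.2, 4.2, 4.2]
margarine_pounds = [8.2, 7.0, 6.5, 5.3, 5.2, 4.0, 4.6, 4.5, 4.2]

r = pearson(divorces_per_thousand, margarine_pounds)
print(f"correlation = {r:.2f}")
```

Two unrelated quantities that both happen to decline over the same period will score highly here, so a strong correlation is a prompt for investigation, not a finding in itself.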
  • 40. Big Data for Complaints • Anticipate complaints ‣ Based on statistical probability and our customer insight • Identify the root cause of complaints ‣ Link complaints to business processes and organisational change • Proactively engage customers in high quality conversations ‣ So poor conversations don’t escalate into complaints
  • 41. The Opportunities • Derive insight from customer behaviour • Analytic probabilities rather than traditional signals ‣ Correlation over causation • Augment our data with 3rd party intelligence • Derive insight from non-traditional (unstructured) sources
  • 42. How Do I Get Started With Big Data?
  • 43. Statistically Probable Starting Point • Big Data is not a panacea! • Big Data will not fix your data quality issues ‣ Customer insight requires a single customer! • Start by assessing your current information architecture • Data Integration
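"Customer insight requires a single customer" can be illustrated with a small integration sketch. Everything here is hypothetical: the record layouts, the choice of a normalised email address as the matching key, and the function names are assumptions made for illustration. Real master data management usually needs fuzzier matching than this.

```python
def normalise_email(email):
    """Trim whitespace and lower-case so variants of one address match."""
    return email.strip().lower()

# Hypothetical extracts from two disconnected systems.
crm = [
    {"email": "Jo.Smith@example.com ", "name": "Jo Smith"},
    {"email": "lee@example.com", "name": "Lee Wong"},
]
complaints = [
    {"email": "jo.smith@example.com", "complaint": "Late delivery"},
    {"email": "JO.SMITH@EXAMPLE.COM", "complaint": "Billing error"},
]

def build_customer_view(crm, complaints):
    """Merge both sources into one record per customer, keyed by email."""
    customers = {}
    for rec in crm:
        key = normalise_email(rec["email"])
        customers[key] = {"name": rec["name"], "complaints": []}
    for rec in complaints:
        key = normalise_email(rec["email"])
        # Complaints from people not in the CRM still get a record.
        customer = customers.setdefault(key, {"name": None, "complaints": []})
        customer["complaints"].append(rec["complaint"])
    return customers

view = build_customer_view(crm, complaints)
```

Without the normalisation step the three spellings of the same address would count as three different customers, which is the data quality problem the slide warns Big Data will not fix on its own.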
  • 45. Disparate Data • [Diagram: separate CRM, Complaints, Financial, Provisioning and Data Warehouse systems] • Disconnected Data • Repeated Data • Data Quality • Data stewardship • Business Glossary • How do I answer questions?
  • 46. Integrated Data • [Diagram: CRM, Complaints, Financial, Provisioning, Warehouse, 3rd Party, Unstructured, Whatever...]
  • 47. Getting Started • Focus on... ‣ Data Integration ‣ Data Quality ‣ Master Data Management • Big Data can help these initiatives ‣ ...but you need to reach a minimum threshold before you start
  • 48. Big Customer Understanding Customer Enrichment Integration Customer Analytics Conversations Feedback
  • 49. What can Big Data do for me? • You don’t need a SKA or 1.23 billion users (like Facebook) to benefit from the approaches adopted by the Big Data organisations • The Big Data toolkit incorporates... • Technology adoption • Statistical modelling • Business change
  • 50. What can Big Data do for me? • Understand your customers • Customer segmentation to a segment of one - The Customer • Anticipate their needs • ... and hence their behaviour • Drive high quality conversations with them • Based on your understanding of them
  • 51. Big Data Technologies • Business intelligence • Visualisation • Infrastructure • Agile methodologies • New data storage architectures • Parallel processing • Machine learning • Statistical modelling
  • 52. Big Data Summary • Characteristics ‣ Volume ‣ Velocity ‣ Variety • Approaches ‣ Incorporating unstructured data into your analysis ‣ Holistic data rather than sampling ‣ Correlation rather than causation • Opportunities ‣ Complaints root cause analysis ‣ Improve the quality of conversations ‣ Anticipate behaviour through deep understanding
  • 53. Privacy - Social Media • Legislation will always trail technology • Social Media sites most frequently have “a worldwide, non-exclusive, royalty-free license, with the right to sublicense” • You’ve never paid Facebook or Twitter a cent • They can ‘monetize’ both your content and your metadata • Most legislation centres around self-policing and opt-out
  • 54. Privacy - Individuals’ Rights • Staples (US) operate a punitive pricing model ‣ Your IP address tells them if you live in an expensive neighbourhood • OfficeMax addressed marketing to “Mike Seay, Daughter Killed in Car Crash” • Washington D.C. police officer convicted after looking up licence plates of vehicles near a gay bar and blackmailing the vehicles’ owners
  • 55. Privacy - I can buy your... • Full name, spouse, children, ex-partners, co-habitees, current address, previous addresses, ownership status, purchase date and price, outstanding mortgage debt • Job type, income band, credit score, credit and store cards, spending habits, charitable contributions, family events (births, deaths) and likely political affiliation • Ethnicity, primary language, and (in the US) health information! ‣ Cancer, diabetes and clinical depression lists with credit score