Why Big Data Will Survive the Hype - and Change the Way We Work


Blair Reeves

This deck accompanied my presentation on big data at the Digital Analytics Association NYC Symposium on December 4, 2013.


(and live)

(and think)

The City University of New York
New York City
December 4, 2013

@BlairReeves

Blair Reeves

Product Lead, IBM Digital Analytics

IBM.com/digitalmarketing
I live here:
Durham, North Carolina


“The Year of Big Data”

Credit: Gartner Research

… is every year. From now on.

The Value of Data is Increasing


The Value of Data
… is still being decided

Facebook
Book Value: $13 billion

Market Value: $114 billion
= 1.3 billion MAUs
~500 terabytes of data added… per day

$101 billion in data

A Short History of Data
300 B.C.
Great Library of Alexandria (Egypt)

970 A.D.
Al-Azhar University (Egypt)

1400
Cambridge University owns 122 books

1450s
Invention of the Gutenberg printing press

1520s
Martin Luther translates the Bible into German, accelerating mass literacy

1710
Copyright law is born

1770s
Press freedom guarantees; pamphleteering

1890
Herman Hollerith invents machine-readable data for U.S. Census

1969
ARPANET goes online (TCP/IP followed in 1983)

2013
Watson

~2.8 billion global internet users
(40% of world’s population)


The Way We Use Data Will Change

Trade Exactitude for Size
Why Sample?

Correlation Over Causality


1 – Trade Exactitude for Size

Precision < Size
More data > Better algorithms


1 – Trade Exactitude for Size
1954
250 word pairs

1990
3 million word pairs

2006
>100 billion word pairs
(and counting)


2 – Why Sample?
• Sampling relies on randomness
• Difficult to drill down into subcategories
• Requires careful pre-planning


2 – Why Sample?
• Sumo wrestlers
• Google Flu
• Non-linear relationships
(social media)


3 – Correlation Over Causality

When does knowing “why” matter?
Data rather than hypotheses

Correlations are value


3 – Correlation Over Causality

A/B Testing

Attribution


“Everything is obvious once you know the answer.”
- Duncan Watts


Thanks!
BReeves@us.ibm.com
@BlairReeves

IBM.com/digitalmarketing
IBMBigDataHub.com


Editor's Notes

There is no strict definition of the term; it merely refers to the process (or capability) of analyzing datasets so large that they couldn’t previously fit into computer memory. This is where we got Google MapReduce and Hadoop. The technology companies that pioneered these techniques were thus able to extract unique new value from the kind of huge data troves that many “offline” companies across many sectors had kept for years.
Today, up to a third of Amazon’s online revenue is derived from its personalization and recommendations engine. Case studies: you can cite any number of case studies about how innovative companies have extracted new value from large, previously unremarkable datasets. In every one of these cases, data has become the newest natural resource, and it is being exploited to create new markets.
Interestingly, guess how many companies have a line item on their balance sheets for “data”? None. Facebook (FB) is one of the single best examples of this mismatch between traditional systems of financial value and new ones. Intangible assets made up 40% of the value of public companies in the 1980s and 75% of their value in the 2010s.
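The arithmetic behind the “$101 billion in data” figure on the value-of-data slide can be sketched as below. This is only an illustration of the gap between market value and book value, using the figures quoted in the deck. Attributing the entire gap to data is the deck's framing rather than an accounting rule, and the function names are hypothetical.

```python
# Figures as quoted on the value-of-data slide (billions of US dollars).
BOOK_VALUE_BN = 13
MARKET_VALUE_BN = 114
MONTHLY_ACTIVE_USERS_BN = 1.3


def implied_intangible_value(market_value, book_value):
    """Gap between what the market pays and what the balance sheet records."""
    return market_value - book_value


def value_per_user(intangible_value_bn, users_bn):
    """Dollars of unrecorded value per monthly active user (the billions cancel)."""
    return intangible_value_bn / users_bn


gap_bn = implied_intangible_value(MARKET_VALUE_BN, BOOK_VALUE_BN)
per_user = value_per_user(gap_bn, MONTHLY_ACTIVE_USERS_BN)

print(f"Value not on the balance sheet: ${gap_bn} billion")
print(f"Roughly ${per_user:.2f} per monthly active user")
```

Dividing by monthly active users is an extra step not taken in the deck; it is included only to show how the same gap can be read per user.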
As human societies consume, generate and process more data, our political, legal and conceptual models must change along with them. While it took hundreds of years for mass literacy and printed information to change Western civilization, we now live in an era in which the amount of data, and access to it, is completely unprecedented. It will change how we think about the nature of information itself. 8 million books were printed from 1453 to 1503. Hollerith shrank tabulating times for the U.S. Census from 8 years to less than 1.
Interestingly, guess how many of the companies listed here have a line item on their balance sheets for “data”? None.
Collecting more data, more often, frequently means sacrificing some level of precision. At large scale, accepting some noise (messiness) in exchange for collecting a larger dataset can mean better predictive power. NoSQL.
IBM 701 Machine – punch card system. Translated 60 sentences smoothly. IBM Candide – ten years’ worth of Canadian parliamentary transcripts. Ultimately it was difficult to scale due to a lack of additional data. Google Translate uses billions of websites and the book-scanning project. In 2013, it covers more than 60 languages.
Whether we sample is sometimes treated as a defining characteristic of “big data”: are we querying an entire dataset rather than a select part of it? Sampling is still very useful sometimes, but always as a second-best alternative to querying an entire dataset. It is an artifact of a data-constrained environment in which storage and processing power were sharply limited.
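The drill-down problem with sampling can be sketched in a few lines of Python. This is a toy illustration on synthetic data, not anything from the talk: the segment names, population size, sample size, and seeds are all assumptions chosen to make a rare subcategory visible in the full dataset and scarce in a random sample.

```python
import random
from collections import Counter


def build_population(n=100_000, rare_share=0.001, seed=7):
    """Synthetic records in which roughly 0.1% belong to a rare segment."""
    rng = random.Random(seed)
    return [
        {"segment": "rare" if rng.random() < rare_share else "common"}
        for _ in range(n)
    ]


def segment_counts(records):
    """Count how many records fall into each segment."""
    return Counter(record["segment"] for record in records)


population = build_population()
# A simple random sample of 500 records, as a data-constrained analyst might take.
sample = random.Random(11).sample(population, 500)

full = segment_counts(population)
sampled = segment_counts(sample)

print("Rare-segment records in the full dataset:", full["rare"])
print("Rare-segment records in the 500-row sample:", sampled["rare"])
```

With a share this small, a 500-row sample is expected to contain less than one rare record on average, so any question about that subcategory has to go back to the full dataset.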
Up to a third of all Amazon’s sales result from its recommendation and personalization engines. These product-to-product correlations matter far more than understanding WHY customers who buy one product like another.
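A product-to-product correlation of this kind can be approximated by simply counting which items are bought together. The sketch below is a minimal co-purchase counter on made-up baskets; it is not Amazon's method, and the item names and function names are hypothetical. Nothing in it models why customers buy a pair of products, only that they do.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each unordered pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(item, counts, top_n=3):
    """Rank other items by how often they were bought together with `item`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(top_n)]


# Made-up purchase histories, for illustration only.
baskets = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["tent", "lantern"],
    ["stove", "kettle"],
    ["tent", "stove", "kettle"],
]

counts = co_purchase_counts(baskets)
print(recommend("tent", counts))
```

Raw counts favor popular items; a real system would normalize for popularity, but the point here is only that the ranking comes from co-occurrence and not from any explanation.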
