Mastering
the New Matrix:
Big Data
Agenda
The Matrix
Data & Data Everywhere
Big Data & Four Vs
Seeing Big Data
Are you Neo?
2014 © BizKnowlogy - Philip Topham
The Matrix
NEO - Agent Smith
Thomas A. Anderson is a man living two lives. By day he is an
average computer programmer and by night a hacker known
as Neo. Neo has always questioned his reality, but the truth is
far beyond his imagination. Neo finds himself targeted by the
police when he is contacted by Morpheus, a legendary
computer hacker branded a terrorist by the government.
Morpheus awakens Neo to the real world, a ravaged
wasteland where most of humanity have been captured by a
race of machines that live off of the humans' body heat and
electrochemical energy and who imprison their minds within
an artificial reality known as the Matrix. As a rebel against the
machines, Neo must return to the Matrix and confront the
agents: super-powerful computer programs devoted to
snuffing out Neo and the entire human rebellion.
Agent Smith
Keeps us Trapped In Data
Trapped by Data
What is Data?
Data and the Human Senses
Taste Touch Sight Sound Smell
How would you describe a Caribbean island?
• Number?
• Spreadsheet?
• Software?
• Smart phone?
What is Data?
An abstract
representation of
something…
Five. 5. 五 (go).
• Feet. Miles.
• Pennies. Dollars
• Atomic number¹
• Speed. Lightyears.
• Miles Per Gallon.
• Acceleration. Meters/Second²
Data is…
¹ Boron
Five. 5. 五 (go).
• Five Story Building
Data is…
An abstract representation of something…
made meaningful in context
Wind Speed
Data Goes Beyond the Five Senses
Temperature GPS Salinity
How would you describe a Caribbean island?
Microsensor Data
Taste Touch Sight Sound Smell
Touch screen Digitizer
Motion sensor Accelerometer
Ambient light sensor
Proximity sensor
Digital cameras
Gyroscope
Moisture Sensor
Cellular, WiFi, Bluetooth
Sensors, Machines, People
The Internet of Things
Data is Everywhere
Is Big Data
The Modern Matrix - Are we Trapped?
Humungous
Gargantuan
Immense
Enormous
Colossal
Tremendous
Monumental
Titanic
Elephantine
Brobdingnagian
Mammoth
Extensive
Sizable
Huge
Great
Vast
Voluminous
Spacious
Whopping
Astronomical
What is ???
Junk Drawer by Marc Miller
https://www.flickr.com/photos/markmarkmark/4551476025/
Four Vs
• Velocity: real-time
• Variety: video, voice, text, sensors
• Volume
• Veracity: many truths
What is Big Data?
Recent History
➡ Hourly
Who attacked our website?
➡ Weekly / Monthly
What did you buy this week?
➡ Preferences
Which web ad do I show you?
➡ Daily fraud alerts
Did any fraud happen today?
Realtime
➡ Realtime watching
Who is attacking our website now?
➡ Realtime spending alerts
Are you overspending now?
➡ Realtime buying
Make instant suggestions
➡ Realtime fraud detection
Block fraudulent use now.
Velocity
Online in 60 Seconds is an infographic that was produced by qmee.com (2013)
From the dawn of civilization to 2003, humankind
generated five exabytes of data. Now we produce five
exabytes every two days…and the pace is accelerating.
Eric Schmidt, Executive Chairman, Google
Volume
What’s an Exabyte?
One non-stop DVD film, starting
50,000 years ago, when Homo
sapiens first arrived in North
America, equals one exabyte!
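The 50,000-year figure can be sanity-checked with a little arithmetic. The sketch below is only a rough estimate under assumptions that are not on the slide: a decimal exabyte (10^18 bytes) and a single-layer 4.7 GB DVD holding about two hours of video.

```python
# Assumptions (not from the slide): decimal exabyte, single-layer 4.7 GB DVD,
# and roughly two hours of video per disc.
EXABYTE_BYTES = 10**18
DVD_BYTES = 4.7 * 10**9
DVD_HOURS = 2.0
HOURS_PER_YEAR = 24 * 365.25


def years_of_dvd_video(total_bytes):
    """Convert a byte count into years of continuous DVD playback."""
    discs = total_bytes / DVD_BYTES
    return discs * DVD_HOURS / HOURS_PER_YEAR


years = years_of_dvd_video(EXABYTE_BYTES)
print(f"One exabyte is about {years:,.0f} years of DVD video")
```

Under these assumptions the result lands a little under 50,000 years, which is consistent with the slide's round number. A different bitrate or disc capacity would shift it.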
Structured
➡ Highly organized data, saved in
repeatable ways that allow for easy
access, manageability, and use
between computer systems.
Unstructured
➡ Recorded data, saved without much
regard for interoperability across
computer systems (a specific program
must be used to read or search the
data)
http://www-i6.informatik.rwth-aachen.de/web/Research/speech_recog.html
Speech recognition waveform
Data entry form
Variety
Structured
➡ Databases
➡ Data Entry forms
➡ Sensor data (at a point in time)
➡ e.g., GPS
Unstructured
➡ Freeform text, speech, movies
➡ Sensor data
➡ Chemical sensors
➡ Heat probes
➡ Wave detectors (sound, light, X-rays, etc.)
➡ Motion (accelerometers, gyroscopes)
➡ Magnetometers
➡ Pressure
Structured: human data curation
Unstructured: little to no curation
Variety
Structured
➡ Known source
➡ Data curated (checked for errors)
Unstructured
➡ Uncertain source
➡ Unclear trail of custody
➡ Occasionally purposeful untruths or
half-truths
➡ Hidden meaning (sarcasm)
➡ Unclear or changing encoding
Structured: human data curation
Unstructured: little to no curation
Veracity
Learning to See Big Data
Standard Computing
The Old Way
Big Data
Does NOT FIT into
One computer
But…60 Trillion Web pages!!
http://www.google.com/insidesearch/howsearchworks/thestory/
Thousands
of Computers
Parallel Computing
How did Google Do it?
How did Google Do it?
Divide and Conquer
PageRank
How did Google Do it?
Divide and Conquer
Count Words
How did Google Do it?
Divide and Conquer
Count Words
frequency
Book 1:
“The quick brown fox…”
Book 2:
“Bill was as quick as a fox”
How did Google Do it?
Divide and Conquer: Mapping Data
“The quick brown fox…”
Brown: 1
Fox: 1
Quick: 1
“Bill was as quick as a fox”
Bill: 1
Fox: 1
Quick: 1
Was: 1
Ignore little words
How did Google Do it?
Divide and Conquer: Reducing Results
From Book 1: Brown: 1, Fox: 1, Quick: 1
From Book 2: Bill: 1, Fox: 1, Quick: 1, Was: 1
A-M: Bill: 1, Brown: 1, Fox: 2
N-Z: Quick: 2, Was: 1
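The map and reduce steps on the two book snippets can be sketched in a few lines of Python. This is a single-machine illustration of the idea, not Google's implementation. The stop-word list, function names and the first-letter partitioning rule are assumptions chosen to match the slides ("Ignore little words", A-M / N-Z).

```python
from collections import Counter, defaultdict

# Assumption: a tiny stop-word list standing in for the slide's "little words".
STOP_WORDS = {"the", "as", "a"}


def map_words(text):
    """Map step: emit a (word, 1) pair for every non-stop word in one book."""
    words = (w.strip('.,…“”"').lower() for w in text.split())
    return [(w, 1) for w in words if w and w not in STOP_WORDS]


def partition(word):
    """Route each word to a reducer by first letter (the A-M / N-Z split)."""
    return "A-M" if word[0] <= "m" else "N-Z"


def reduce_counts(pairs):
    """Reduce step: sum the counts for each word within one partition."""
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(books):
    shuffled = defaultdict(list)
    for text in books:  # in a real cluster each book could go to a different machine
        for word, count in map_words(text):
            shuffled[partition(word)].append((word, count))
    return {part: reduce_counts(pairs) for part, pairs in shuffled.items()}


books = ["The quick brown fox…", "Bill was as quick as a fox"]
result = word_count(books)
print(result)
```

Each book is mapped independently, and each partition is reduced independently. That independence is what lets the work spread across thousands of computers.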
How did Google Do it?
If they count words, why page 17?
How did Google Do it?
Divide and Conquer
Add Location
Page 17
Page 1
How did Google Do it?
Divide and Conquer
Apply Information Theory
Dress
Carrot
Lemon
Cupcake +
How did Google Do it?
Divide and Conquer
• Count words
• Word Importance
Title, Heading, Font Size
• Geolocation
• Related Importance
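One way to picture "count words" combined with "word importance" is a weighted count, where a word in a title counts for more than the same word in body text. The sketch below is purely illustrative. The weights, field names and sample pages are hypothetical, and geolocation and related importance are left out.

```python
# Hypothetical weights: the slide names the signals but gives no numbers.
FIELD_WEIGHTS = {"title": 5.0, "heading": 3.0, "body": 1.0}


def score_page(page, query_word):
    """Sum occurrences of query_word, weighted by the field they appear in."""
    word = query_word.lower()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = page.get(field, "").lower().split()
        score += weight * words.count(word)
    return score


# Hypothetical pages, each split into the fields named on the slide.
pages = {
    "page_a": {"title": "Fox facts", "heading": "", "body": "the fox is quick"},
    "page_b": {"title": "Dog facts", "heading": "", "body": "a fox a fox a fox"},
}

ranking = sorted(pages, key=lambda name: score_page(pages[name], "fox"), reverse=True)
print(ranking)
```

Here page_b mentions the word more often, but page_a ranks first because the word appears in its title. That is the effect the "word importance" bullet points at.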
How did Amazon Do it?
Divide and Conquer
Count Behaviors
Westerns
How did Amazon Do it?
Divide and Conquer
Count Behaviors
Did I browse (click)? buy?
How did Amazon Do it?
Divide and Conquer
Count Behaviors
AMAZON PATENT US7113917 B2
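"Count behaviors" can be illustrated by counting which items are bought together and suggesting the most frequent companions. This is a minimal sketch of that counting idea only. It is not the method claimed in the patent, and the item names and purchase histories are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def count_co_purchases(baskets):
    """For each item, count how often every other item was bought with it."""
    together = defaultdict(Counter)
    for basket in baskets:
        for item, other in permutations(set(basket), 2):
            together[item][other] += 1
    return together


def suggest(together, item, n=2):
    """Return up to n items most often bought alongside `item`."""
    return [other for other, _ in together[item].most_common(n)]


# Hypothetical purchase histories, one set of items per customer.
baskets = [
    {"western_dvd", "cowboy_hat"},
    {"western_dvd", "cowboy_hat", "popcorn"},
    {"western_dvd", "popcorn"},
    {"romance_dvd", "popcorn"},
]

together = count_co_purchases(baskets)
print(suggest(together, "cowboy_hat", n=1))
```

Because each basket is counted on its own, the baskets can be divided among many machines and the partial counts added together afterwards. Clicks or page views could be counted the same way as purchases.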
National League Baseball
I’m not Amazon or Google?
Now What?
Are you Neo?
Will you look beyond the data?
Will you create a new story?
Big Data Thinking
➡ DATA - Unimportant alone
➡ CONTEXT - Creates meaning
➡ BIG DATA - Is overwhelming. It has no context
➡ BUSINESS - Our world, always has context
➡ DIVIDE and CONQUER - Split into small parts
Five. 5. 五 (go).
Five Feet
Five Gazillion ???
5% Increased Sales
1 + 2 + 3….
Big Data Skills
Base Skills:
Mathematics, Algorithms & Data structures
Filter & Mine
Acquire & Clean
Represent
Visualize
Domain Expertise
Data Skills
divide and conquer
Big Data Thinking
Patterns Emerge
Our Reflected Selves = Aggregated Data in Context
The Modern Matrix
ONENEO