BIG DATA | How to explain it & how to use it
for your career?
NetCom Learning
NetCom Learning – Managed Learning
Services
Today’s Agenda
If you ask people what BIG DATA is they often say it is about a lot of
data. But the world has ALWAYS had a lot of data! It is about
datafication – a word so new that even spellcheck functions don’t
know it’s a real word!
Today’s Agenda
 How BIG DATA changes career paths of even the most unsuspecting!
 How BIG DATA changes the way business decisions are made.
 How BIG DATA changes who makes the decisions & the reshuffling balance of power.
 What BIG DATA skills can you bring to the office tomorrow to increase your value.
The experienced
Data scientists &
those managers
who leverage
them.
BIG DATA is a management tool even if you have other employees perform
the coding.
BIG DATA is as ubiquitous as the internet.
Gut instinct now
of less value
Datafication
A modern technological trend turning
many aspects of our life into computerized
data that transforms respective
information into new forms of value.
Data
Information
Knowledge
Wisdom
Insight
Knowledge—Wisdom--Insight Vincent Suppa
This is the fulcrum that changes everything.
Knowledge
Information
data
Insight
Wisdom
Actionable
Insight
BIG DATA
A Metaphor / Illustration
Diagramming an
Algorithm
activity or
purpose natural
to or intended
for a person or
thing.
relationship or
expression
involving one
or more
variables.
Algorithm Script
Just as voice mail and email obviated the manager’s need of
secretarial functions, algorithms eating BIG DATA are now
obviating tactical managerial functions.
Transactional
Work
Tactical
Work
Strategy needs to consume data.
Data, without strategy, has little value.
Modified sine wave
Sine wave
What is the
difference between
analogue
and
digital?
Datafication
only possible due to digitalization of
analogue information.
Digital versus Analogue
Interprets continuous sine wave as a digital recreation.
This photo was
taken on film – not
a digital camera.
Are there data points within this
“single” data point?
Social
Construct
Another
example
of social
construct
Now to the
show.
Big data: broad term
for data sets so large &
complex that
traditional data
processing applications
are inadequate.
A terabyte, petabyte &
gigabyte walk into a bar...
To give us a sense of scale.
Yottabyte is
1,000 trillion gigabytes
Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta (smallest to largest)
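To make the scale concrete, here is a minimal Python sketch that assumes decimal (SI) prefixes, where each step up is a factor of 1,000. The `PREFIXES` table and the `bytes_in` helper are illustrative names, not part of the original deck.

```python
# Decimal (SI) prefixes: each step up multiplies the size by 1,000.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def bytes_in(prefix: str) -> int:
    """Return the number of bytes in one <prefix>byte (decimal convention)."""
    return 1000 ** (PREFIXES.index(prefix) + 1)


gigabytes_per_yottabyte = bytes_in("yotta") // bytes_in("giga")
trillion = 10 ** 12

# Checks the slide's claim that a yottabyte is 1,000 trillion gigabytes.
print(gigabytes_per_yottabyte == 1000 * trillion)
```

Binary prefixes (factors of 1,024) would give slightly larger numbers; the decimal convention is used here because it matches the figure quoted on the slide.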
The Least You Need to Know About BIG DATA
BIG DATA manifests 3 basic shifts:
 From Small to All
 Clean to Messy
 Causation to Correlation
V. Suppa The Definitive 90 Thousand Foot Lecture on BIG Data© 2014
Scope of Traditional Data
 Data growth analogous to y = tan x.
 In 2000, ¼ of the world’s information was digital; the remainder was preserved in analog.
 Digital data doubles around every 3 years.
 In 2014 less than 2% of all stored information is analog. (And now we’re in 2017!)
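The "doubles around every 3 years" rule of thumb can be turned into a quick calculation. This is only a sketch of that one claim: the function name `growth_factor` and the default doubling period are assumptions for illustration, and steady doubling is exponential growth rather than an exact model of real storage trends.

```python
def growth_factor(years: float, doubling_period: float = 3.0) -> float:
    """How many times larger the data volume is after `years`,
    assuming it doubles once every `doubling_period` years."""
    return 2 ** (years / doubling_period)


# One doubling period doubles the volume; two periods quadruple it.
print(growth_factor(3))
print(growth_factor(6))
# Growth implied between 2000 and 2014 under the same assumption.
print(round(growth_factor(14), 1))
```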
Big Data is Not About Lots of Data
 Lots of data existed before Big Data!
 Big Data: ability to render aspects of life into data points
never quantified before.
 This is DATAFICATION … your new word of the day!
DATAFICATION
Location was datafied
before GPS was invented
 Words
treated as
data.
 Friendships
& likes
datafied, via
Facebook
 Shigeomi Koshimizu datafied body contour (body,
posture, weight distribution, etc.).
 Quantified “sitting down.” Measured pressure drivers
exert at 360 different points via sensors (0 to 256 scale).
Quality → Quantify
Datafication Turns Everything into a Data Point
Tools of Datafication
 inexpensive computers (commodity)
 powerful processors (commodity)
 basic statistics (commodity)
 clever software (commodity)
 smart algorithm (differentiator)
Lots of Data versus BIG DATA
Computers computing lots of data:
Teaching computer to translate by inputting bilingual dictionaries
Computers computing BIG DATA
Feed computer years of Canadian parliamentary transcripts (French / English)
Statistically program it to infer which English word is the best alternative to the French one
In context, French word lumière
more appropriate substitute for
the English word light than
léger.
Isn’t this
how a
person
translates?
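The statistical idea can be sketched in a few lines of Python. This is not the method any real translation system uses: it counts how often each French word appears in sentences aligned with an English word and scores the pairing with the Dice coefficient. The sentence pairs in `PAIRS` are invented for illustration, and `dice_scores` is a hypothetical helper name.

```python
from collections import Counter

# Tiny invented parallel corpus: (English sentence, French sentence).
PAIRS = [
    ("the light is on", "la lumière est allumée"),
    ("turn off the light", "éteins la lumière"),
    ("the bag is light", "le sac est léger"),
    ("the light of the sun", "la lumière du soleil"),
    ("the house is big", "la maison est grande"),
]


def dice_scores(pairs, source_word):
    """Score each target word by 2 * co-occurrences / (source count + target count)."""
    source_count = 0
    target_counts = Counter()
    co_occurrences = Counter()
    for source, target in pairs:
        source_tokens = set(source.lower().split())
        target_tokens = set(target.lower().split())
        target_counts.update(target_tokens)
        if source_word in source_tokens:
            source_count += 1
            co_occurrences.update(target_tokens)
    return {
        word: 2 * count / (source_count + target_counts[word])
        for word, count in co_occurrences.items()
    }


scores = dice_scores(PAIRS, "light")
best = max(scores, key=scores.get)
print(best)  # the French word most strongly associated with "light" in this corpus
```

With more aligned text, the sense of "light" as "not heavy" would pull léger ahead in the sentences where that meaning applies, which is the point of letting context decide.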
A Quick Review & then … Causation to Correlation
 sampling population → entire population
 pristine data → non-curated messy data
 causation → correlation
Reasons on how the world works replaced with learning about
association among phenomena
 Knowing cause “is” desirable.
 But cause is harder to figure out
 Cause as illusion? Cognitive bias
Saving Trucks Saving Babies
Saving
Epidemics
Saving
Buildings
Place sensors on parts to identify heat
& vibrational patterns associated with
failures leading to breakdowns.
Can predict a breakdown before it happens &
replace parts in garage & not on side of the road.
 Data does not tell us why the part is in trouble
 It reveals enough to know the what
 Can guide investigations into discovering underlying cause
Causation to Correlation
 When saving lives, knowing something is likely
to occur more important than knowing why.
 Eventually, “the why” will be investigated.
Can Big Data Save Babies?
Used Big Data to spot infections in premature
babies before symptoms appear.
 Information flow >1000 data points per second
 Discovered correlations between very minor changes and more serious problems
Big Data Predicts Epidemics Better than CDC
CDC tracks patient visits to clinics
Information suffers from a 2-week reporting lag
Google took the 50 million most commonly searched terms from 2003 – 2008
Compared them against historical influenza data from CDC.
Searches then correlated with CDC’s data on outbreaks of flu.
How All Three Shifts Are Illustrated
Small to All
Ran 100% of US searches for 6 years through an algorithm;
identified 45 searches correlated with CDC data on flu outbreaks (runny nose,
body aches, etc.).
Clean to Messy
Searches imperfect with misspellings, incomplete phrases & included healthy
people searching on behalf of others.
Causation to Correlation
Will anyone claim typing symptoms in a search engine gives you the flu?
Big Data via searches predicts outbreaks in real time, compared to
CDC’s traditional data analytics, which lag by 2 weeks.
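The correlation step can be illustrated with a small Python sketch. It ranks search terms by how closely their weekly counts track reported flu visits, using the Pearson correlation coefficient. All numbers below are invented for illustration, and the variable names are assumptions; this is not Google's actual model.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical weekly figures, made up for this example.
cdc_flu_visits = [120, 150, 310, 480, 400, 220]
search_counts = {
    "runny nose": [1000, 1300, 2900, 4500, 3800, 2000],
    "body aches": [800, 950, 2100, 3300, 2900, 1500],
    "cheap flights": [5000, 4900, 5100, 5000, 4950, 5050],
}

# Rank terms from most to least correlated with the flu data.
ranked = sorted(
    search_counts,
    key=lambda term: pearson(search_counts[term], cdc_flu_visits),
    reverse=True,
)
print(ranked)
```

A high coefficient says only that the two series move together. As the slide notes, nobody claims that typing symptoms into a search engine causes the flu.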
Illegally subdivided buildings
more likely to catch fire.
200 inspectors to respond to 25K
complaints / year wrt overcrowded
buildings.
NYC created database of 900K buildings augmented
by troves of data collected by 19 agencies:
• Records of tax liens
• Anomalies in utility usage
• Service cuts
• Missed payments
• Ambulance visits
• Local crime rates
• Rodent complaints
• Etc.
Big Data
increases the
productivity of
each inspector
How Did They Do It?
1. Compared database (5 years of building fires)
2. Ranked by severity
3. Observed correlation. (Not causality!)
4. Data scientists triaged complaints for inspections.
Concluded that a building’s:
 type & age main predictor of fire; other variables superfluous
 permit for exterior brickwork correlated with lower risk of fire.
Result: Vacate orders increased from 13% to 70%
Building characteristics did not cause fire but were correlated with fire risk.
Spending money on the exterior
correlates with an up-to-code interior
But just the intent to begin work
correlates enough to predict an outcome
Pull disparate sets of text & put them into a
“point of singularity.”
Currently roughly 70% of data is text. Pictures are to be
quantified under separate protocols.
Create a corpus: the body of text to
be analyzed.
R, for example, has a set of functions to clean up a corpus by excluding data points
superfluous to analysis (delete commas, periods & words such as “but” & “and”, etc.).
R cleans up files by reducing the corpus to the primary words crucial to analysis.
It truncates words with a common stem; this is called stemming (e.g. “engineer” &
“engineering” both become the same word). Think of the mathematical analogy of
number factoring versus least common denominator.
Mathematical matrix that describes the frequency of
terms that occur in a collection of documents.
Rows correspond to documents in the collection
& columns correspond to terms.
Create a document term matrix that measures
frequency of words that remain after corpus
“cleanup” discussed in previous slide.
You are left with primary
outputs that enable you to do
counts in each cell.
You’ve datafied or quantified
words that others only qualify,
which prevents analysis.
You can now do lots of
interesting stuff!
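The corpus cleanup, stemming and document-term matrix steps can be sketched end to end. The slides describe R's tooling; the Python below is a stand-in written for illustration, not R's functions. The stop-word list is a small invented sample, and the suffix stripper in `stem` is a deliberately crude substitute for a real stemmer such as Porter's.

```python
import string
from collections import Counter

STOP_WORDS = {"but", "and", "the", "a", "is", "of", "to"}  # illustrative sample only
SUFFIXES = ("ing", "s")  # crude stand-in for a real stemming algorithm


def stem(word):
    """Strip one known suffix if at least four letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def clean(text):
    """Lowercase, drop punctuation and stop words, then stem what is left."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [stem(word) for word in text.split() if word not in STOP_WORDS]


def document_term_matrix(corpus):
    """Rows are documents, columns are terms, cells are term counts."""
    counts = [Counter(clean(document)) for document in corpus]
    terms = sorted(set().union(*counts))
    matrix = [[count[term] for term in terms] for count in counts]
    return terms, matrix


corpus = [
    "The engineer is engineering a bridge.",
    "Engineers love bridges, but bridges rust.",
]
terms, matrix = document_term_matrix(corpus)
print(terms)
for row in matrix:
    print(row)
```

Each cell now holds a count, so the words have been turned into numbers that clustering or frequency analysis can work with.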
Term document matrix cluster
analysis reveals prevalent themes.
Document-term matrix
Cluster analysis: review how all your words cluster in your data matrix.
The result of this analysis is that we can reduce our matrix to fewer columns.
Font Size & even
Color embedded
with information.
This information
is actionable.
For centuries we have manually counted sets of
words to determine their frequencies.
Zipf's law states that given some corpus of
natural language utterances, the frequency of any
word is inversely proportional to its rank in the
frequency table.
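A short Python sketch shows what the rank-frequency relationship looks like. The token list is constructed so that it follows Zipf's law exactly (frequency equals the top frequency divided by rank); real text only approximates this. The helper name `rank_frequency` is an assumption for illustration.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) rows, most frequent word first."""
    counts = Counter(tokens).most_common()
    return [(rank, word, count) for rank, (word, count) in enumerate(counts, start=1)]


# Toy token list built to follow the law exactly: 12, 12/2, 12/3, 12/4.
tokens = ["data"] * 12 + ["big"] * 6 + ["value"] * 4 + ["insight"] * 3
table = rank_frequency(tokens)

# Under Zipf's law, rank * frequency stays roughly constant.
products = [rank * count for rank, _, count in table]
print(table)
print(products)
```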
Used for resumes as a way to
increase information density – to
be covered at a future webinar.
 With these data sets, we can run sentiment analysis!
 Determine occurrence rate of certain themes qualified as opinions.
 To determine if people like a restaurant we’d look at words
reviewers used via social media in the comment section.
Love
10
Hate
-10
Dislike
- 7
Qualitatively, we quantify the
weakness or strength of these signals.
We determine words that correlate to
having disliked or liked the movie and
to what degree along a predetermined
discrete continuum.
Pre-established words in
narrative responses, now
embedded in clusters,
signal positive or negative
statements about a movie,
restaurant or Hammacher
Schlemmer customer
review.
Like
7
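The word scores on these slides (Love 10, Like 7, Dislike -7, Hate -10) are enough for a minimal lexicon-based sentiment scorer. The sketch below averages the scores of any lexicon words found in a review. The function name `sentiment_score`, the averaging rule and the sample review are assumptions for illustration; production sentiment analysis also handles negation, sarcasm and much larger vocabularies.

```python
# Word scores taken from the slides, on a -10 to 10 scale.
LEXICON = {"love": 10, "like": 7, "dislike": -7, "hate": -10}


def sentiment_score(review):
    """Average the lexicon scores of the words in a review (0.0 if none match)."""
    words = [word.strip(".,!?;:") for word in review.lower().split()]
    hits = [LEXICON[word] for word in words if word in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


print(sentiment_score("I love the pasta but dislike the noise."))
print(sentiment_score("We hate waiting."))
```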
The difference between analog and digital signals is that
an analog signal is a continuous electrical message while digital is a
series of values that represent information.
To determine which traits can predict future outcomes, look at historical data.
Correlate “judgements” to see if they can predict from groupings, meaning which
ones predict against another dataset.
This is cross validation; it is carried out on historical data sets.
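Cross validation can be sketched as follows: split the historical records into k folds, fit a model on all folds but one, and score it on the fold that was held out. The model here is a toy threshold rule, the data are invented, and all function names are assumptions for illustration rather than the presenter's method.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_threshold(xs, ys):
    """Toy model: midpoint between the mean x of each class.
    Assumes both classes appear in the training data."""
    positives = [x for x, y in zip(xs, ys) if y == 1]
    negatives = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def cross_validate(xs, ys, k):
    """Average held-out accuracy over k folds."""
    accuracies = []
    for test_indices in k_fold_indices(len(xs), k):
        held_out = set(test_indices)
        train = [i for i in range(len(xs)) if i not in held_out]
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        correct = sum((xs[i] > threshold) == (ys[i] == 1) for i in test_indices)
        accuracies.append(correct / len(test_indices))
    return sum(accuracies) / k


# Hypothetical historical records: one trait value and a known outcome (0 or 1).
trait = [1, 9, 2, 8, 3, 7, 2, 9]
outcome = [0, 1, 0, 1, 0, 1, 0, 1]
print(cross_validate(trait, outcome, k=4))
```

Because every record is scored by a model that never saw it during fitting, the averaged accuracy is a fairer estimate of how the trait would predict new cases.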
Master Algorithms script other
algorithms on an as-needed basis
free of human interaction.
Machine to machine (M2M) technology that
enables networked devices to exchange
information & perform actions without the
manual assistance of humans.
This is what is replacing traditional
managerial jobs.
Firms that still employ these types of
jobs feel less pressure to keep salaries at
pace with inflation over time.
Machine learning can test statistical models: for
example, testing against known political party membership
& updating the algorithm as new data comes in.
In M2M, we let data points come in, refresh & update to
automatically script an even more accurate algorithm.
Can infer your political affiliation by
first 19 likes even if those likes are
completely apolitical.
What Can I Do Tomorrow Morning at the Office?
1. Take inventory of the data you already collect
A. Internal data.
B. External data accessed via the FOI Act – to be discussed subsequently.
C. External data legally purchased from vendors (Yelp, FB, DoubleClick, etc.)
D. Create a glossary of data definitions. (headcount example)
2. Determine decisions to derive from Big Data
A. Select most pressing problem based on Pareto 80/20 rule.
B. In plain English, state your problem statement.
C. Write down independent variables (inventory set of data at your disposal.)
D. Determine dependent variable (preferred outcome to your problem statement.)
3. Write down your hypothesis
4. Contact your IT or data science department. If not …..
5. Contract STEM grad students & turn them into data scientists
6. Code your hypothesis
Even if I hate coding and math!
Quantitative Skills
The Freedom of Information
Act (FOIA), 5 U.S.C. § 552, is a
federal law that allows for
disclosure of previously
unreleased information controlled
by the US government.
Correlate to external
data with troves of data
from US gov’t.
(Examples: MTA apps)!
Enacted in 1966, allows
any person to petition
government for official
information.
Business problem you are trying to solve in plain language stated as a
problem statement
State it in a hypothesis.
Collect Data, from systems
already set in place.
Test hypothesis
Coding is
the new
literacy.
Coding Classes.
Most are on-line, a
few on-site.
Some free & some
at cost.
Most of you will not be competing
with other coders – just other
Marketing, HR or Financial
professionals who know nothing
about coding!
Should I learn to read?
Should I learn how to use the internet?
Should I learn about coding?
A little about R
• R – Free
• Contains embedded tools to pull external data
• Tools that scrape data from websites (Reuters, as one example)
• Text Mining: KNIME (another software tool for text mining) – you can
download it. (Pronounced like “nine” but with an “m”. Has a graphical interface
instead of using a scripting language.)
• Remember, a word cloud is an example of text mining.
• R was written in the C language – coders wrote functions in C to create
macros in R to pull data, analogous to a macro in Excel.
• R will let you pull data into a corpus.
KNIME (Konstanz Information Miner): open source data analytics, reporting & integration
platform. It integrates various components for machine learning & data mining.
You’re not competing against other coders.
You’re competing against others in your field that know
nothing about coding.
Facebook accomplished
what democratic gov’t
tried but failed to do –
build a database of
citizens.
Datafication takes all aspects of life & turns them into data.
Google’s
augmented reality
glasses datafy the
gaze
Twitter datafies
stray thoughts
LinkedIn
datafied
professional
networks
The Floor as a Giant iPad via
surface based computing technology
No thank you,
I’m just looking .
That’s okay,
I’m datafying
your every
move.
Touch sensitive floor
customers walk on
Download a Coupon Texted to You
• What aisles did you walk down or ignore?
• In what sequence did you browse the aisles?
• How long were you in the store?
• What is length of time between store visits?
• How long did you linger in front of the cereal aisle?
• When you checked out, did cereal wind up in your cart? How many boxes?
Compare viewing patterns with what wound up in your shopping cart.
Script algorithms to better predict the dependent variable (revenue thresholds)
from independent variables (what they stock).
So, what’s my role again in a Big Data World?
As Big Data becomes ubiquitous, what skills mark points of differentiation?
 Discovering latent needs & intuition that goes against the facts?
 The mere ability to define a problem precedes its solution
Big Data has a quantitative & qualitative side
And if you hate math - qualitative skills to harness
 Develop observational skills to separate signal from the noise
 Take inventory of existing data
 Learn to develop hypotheses to test
 Learn how to access external data (FOIA, LinkedIn, etc.)
 Liaison between internal ERP data & external data
 Network with STEM students to contract data scientists
Your Role in a Big Data World
If Ford queried BIG DATA to discover what customers want, he’d
come up with faster horses who required less water.
In Big Data world, traits to be developed:
 Creativity
 Intuition
 Intellectual curiosity
 Leveraging errors
 Risk taking
Read outside your discipline
non sequitur
Vincent Suppa © 2016
Don’t be afraid to fail.
Business is not figure skating. It’s the X games!
If you fail, have quick-to-market failures that
mitigate loss & allow you to harvest what did work
for the next initiative.
Capitalism without failure is
like a religion without sin.
Recommended Courses
NetCom Learning offers a comprehensive portfolio for Big Data training
options. Please see below the list of recommended courses with upcoming
schedules:
Introduction to Python Programming
Essential Python
Introduction to Python Scripting: for the Security Analyst
Check out more Big Data training options with NetCom Learning.
Our live webinars will help you touch base on a wide variety of IT, soft skills and business
productivity topics; and keep you up to date on the latest IT industry trends. Register
now for our upcoming webinars:
A Brief on Benefits of ITIL for the Organization – April 4
Visualization with Tableau to Enhance Efficiency in Organization – April 6
How Machine Learning Helps Organizations to Work More Efficiently? – April 11
Why Certified Associate in Project Management (CAPM) and How to Prepare? - April 18
A Brief About DevOps and its Practices – April 20
Special Promotion
Whether you're learning new IT or Business skills, or you are developing
a learning plan for your team, for limited time, register for our
Guarantee to Run classes and get 25% off on the course price.
Learn more»
To get latest technology updates, please follow our social media pages!
THANK YOU !!!

Defense Against Multi-Network Breaches.pdfDefense Against Multi-Network Breaches.pdf
Defense Against Multi-Network Breaches.pdf
 
Cybersecurity Incident Handling & Response in Under 40 Minutes.pdf
Cybersecurity Incident Handling & Response in Under 40 Minutes.pdfCybersecurity Incident Handling & Response in Under 40 Minutes.pdf
Cybersecurity Incident Handling & Response in Under 40 Minutes.pdf
 
An Introduction to CompTIA Security+ - SY0-601.pdf
An Introduction to CompTIA Security+ - SY0-601.pdfAn Introduction to CompTIA Security+ - SY0-601.pdf
An Introduction to CompTIA Security+ - SY0-601.pdf
 
CCNP Enterprise Networks Move One Step Closer to Advanced Networking(Handout)...
CCNP Enterprise Networks Move One Step Closer to Advanced Networking(Handout)...CCNP Enterprise Networks Move One Step Closer to Advanced Networking(Handout)...
CCNP Enterprise Networks Move One Step Closer to Advanced Networking(Handout)...
 
What is New with CompTIA Network+.pdf
What is New with CompTIA Network+.pdfWhat is New with CompTIA Network+.pdf
What is New with CompTIA Network+.pdf
 
What is new with CompTIA PenTest+- PT0 002 - NetCom Learning.pdf
What is new with CompTIA PenTest+- PT0 002 - NetCom Learning.pdfWhat is new with CompTIA PenTest+- PT0 002 - NetCom Learning.pdf
What is new with CompTIA PenTest+- PT0 002 - NetCom Learning.pdf
 
Agile Fundamentals One Step Guide for Agile Projects(Handout).pdf
Agile Fundamentals One Step Guide for Agile Projects(Handout).pdfAgile Fundamentals One Step Guide for Agile Projects(Handout).pdf
Agile Fundamentals One Step Guide for Agile Projects(Handout).pdf
 
Getting Started with AWS Devops.pdf
Getting Started with AWS Devops.pdfGetting Started with AWS Devops.pdf
Getting Started with AWS Devops.pdf
 
Certified Ethical Hacker v11 First Look.pdf
Certified Ethical Hacker v11 First Look.pdfCertified Ethical Hacker v11 First Look.pdf
Certified Ethical Hacker v11 First Look.pdf
 
An overview of agile methods and agile project management
An overview of agile methods and agile project management An overview of agile methods and agile project management
An overview of agile methods and agile project management
 
The essentials of ccna master the latest principles(handouts)
The essentials of ccna master the latest principles(handouts)The essentials of ccna master the latest principles(handouts)
The essentials of ccna master the latest principles(handouts)
 
Unlock the value of itil 4 with 5 key takeaways that can be used today(handout)
Unlock the value of itil 4 with 5 key takeaways that can be used today(handout)Unlock the value of itil 4 with 5 key takeaways that can be used today(handout)
Unlock the value of itil 4 with 5 key takeaways that can be used today(handout)
 
CHFI First Look by NetCom Learning - A Free Course on Digital Forensics
CHFI First Look by NetCom Learning - A Free Course on Digital ForensicsCHFI First Look by NetCom Learning - A Free Course on Digital Forensics
CHFI First Look by NetCom Learning - A Free Course on Digital Forensics
 
Master Class: Understand the Fundamentals of Architecting on AWS
Master Class: Understand the Fundamentals of Architecting on AWSMaster Class: Understand the Fundamentals of Architecting on AWS
Master Class: Understand the Fundamentals of Architecting on AWS
 
How to Deploy Microsoft 365 Apps and Workloads.
How to Deploy Microsoft 365 Apps and Workloads.How to Deploy Microsoft 365 Apps and Workloads.
How to Deploy Microsoft 365 Apps and Workloads.
 
Learn to utilize cisco unified communications for better collaboration( hando...
Learn to utilize cisco unified communications for better collaboration( hando...Learn to utilize cisco unified communications for better collaboration( hando...
Learn to utilize cisco unified communications for better collaboration( hando...
 
NetCom learning webinar how to manage your projects with disciplined agile (d...
NetCom learning webinar how to manage your projects with disciplined agile (d...NetCom learning webinar how to manage your projects with disciplined agile (d...
NetCom learning webinar how to manage your projects with disciplined agile (d...
 
NetCom learning webinar cnd first look by netcom learning - network defender fre
NetCom learning webinar cnd first look by netcom learning - network defender freNetCom learning webinar cnd first look by netcom learning - network defender fre
NetCom learning webinar cnd first look by netcom learning - network defender fre
 

Recently uploaded

amptalk_RecruitingDeck_english_2024.06.05
amptalk_RecruitingDeck_english_2024.06.05amptalk_RecruitingDeck_english_2024.06.05
amptalk_RecruitingDeck_english_2024.06.05
marketing317746
 
Authentically Social by Corey Perlman - EO Puerto Rico
Authentically Social by Corey Perlman - EO Puerto RicoAuthentically Social by Corey Perlman - EO Puerto Rico
Authentically Social by Corey Perlman - EO Puerto Rico
Corey Perlman, Social Media Speaker and Consultant
 
Project File Report BBA 6th semester.pdf
Project File Report BBA 6th semester.pdfProject File Report BBA 6th semester.pdf
Project File Report BBA 6th semester.pdf
RajPriye
 
Exploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social DreamingExploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social Dreaming
Nicola Wreford-Howard
 
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdfSearch Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Arihant Webtech Pvt. Ltd
 
LA HUG - Video Testimonials with Chynna Morgan - June 2024
LA HUG - Video Testimonials with Chynna Morgan - June 2024LA HUG - Video Testimonials with Chynna Morgan - June 2024
LA HUG - Video Testimonials with Chynna Morgan - June 2024
Lital Barkan
 
Digital Transformation and IT Strategy Toolkit and Templates
Digital Transformation and IT Strategy Toolkit and TemplatesDigital Transformation and IT Strategy Toolkit and Templates
Digital Transformation and IT Strategy Toolkit and Templates
Aurelien Domont, MBA
 
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdfMeas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
dylandmeas
 
Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...
dylandmeas
 
Premium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern BusinessesPremium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern Businesses
SynapseIndia
 
Building Your Employer Brand with Social Media
Building Your Employer Brand with Social MediaBuilding Your Employer Brand with Social Media
Building Your Employer Brand with Social Media
LuanWise
 
Buy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star ReviewsBuy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star Reviews
usawebmarket
 
BeMetals Investor Presentation_June 1, 2024.pdf
BeMetals Investor Presentation_June 1, 2024.pdfBeMetals Investor Presentation_June 1, 2024.pdf
BeMetals Investor Presentation_June 1, 2024.pdf
DerekIwanaka1
 
The effects of customers service quality and online reviews on customer loyal...
The effects of customers service quality and online reviews on customer loyal...The effects of customers service quality and online reviews on customer loyal...
The effects of customers service quality and online reviews on customer loyal...
balatucanapplelovely
 
ikea_woodgreen_petscharity_dog-alogue_digital.pdf
ikea_woodgreen_petscharity_dog-alogue_digital.pdfikea_woodgreen_petscharity_dog-alogue_digital.pdf
ikea_woodgreen_petscharity_dog-alogue_digital.pdf
agatadrynko
 
Call 8867766396 Satta Matka Dpboss Matka Guessing Satta batta Matka 420 Satta...
Call 8867766396 Satta Matka Dpboss Matka Guessing Satta batta Matka 420 Satta...Call 8867766396 Satta Matka Dpboss Matka Guessing Satta batta Matka 420 Satta...
Call 8867766396 Satta Matka Dpboss Matka Guessing Satta batta Matka 420 Satta...
bosssp10
 
Understanding User Needs and Satisfying Them
Understanding User Needs and Satisfying ThemUnderstanding User Needs and Satisfying Them
Understanding User Needs and Satisfying Them
Aggregage
 
Agency Managed Advisory Board As a Solution To Career Path Defining Business ...
Agency Managed Advisory Board As a Solution To Career Path Defining Business ...Agency Managed Advisory Board As a Solution To Career Path Defining Business ...
Agency Managed Advisory Board As a Solution To Career Path Defining Business ...
Boris Ziegler
 
FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134
LR1709MUSIC
 
Auditing study material for b.com final year students
Auditing study material for b.com final year  studentsAuditing study material for b.com final year  students
Auditing study material for b.com final year students
narasimhamurthyh4
 

Recently uploaded (20)

amptalk_RecruitingDeck_english_2024.06.05
amptalk_RecruitingDeck_english_2024.06.05amptalk_RecruitingDeck_english_2024.06.05
amptalk_RecruitingDeck_english_2024.06.05
 
Authentically Social by Corey Perlman - EO Puerto Rico
Authentically Social by Corey Perlman - EO Puerto RicoAuthentically Social by Corey Perlman - EO Puerto Rico
Authentically Social by Corey Perlman - EO Puerto Rico
 
Project File Report BBA 6th semester.pdf
Project File Report BBA 6th semester.pdfProject File Report BBA 6th semester.pdf
Project File Report BBA 6th semester.pdf
 
Exploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social DreamingExploring Patterns of Connection with Social Dreaming
Exploring Patterns of Connection with Social Dreaming
 
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdfSearch Disrupted Google’s Leaked Documents Rock the SEO World.pdf
Search Disrupted Google’s Leaked Documents Rock the SEO World.pdf
 
LA HUG - Video Testimonials with Chynna Morgan - June 2024
LA HUG - Video Testimonials with Chynna Morgan - June 2024LA HUG - Video Testimonials with Chynna Morgan - June 2024
LA HUG - Video Testimonials with Chynna Morgan - June 2024
 
Digital Transformation and IT Strategy Toolkit and Templates
Digital Transformation and IT Strategy Toolkit and TemplatesDigital Transformation and IT Strategy Toolkit and Templates
Digital Transformation and IT Strategy Toolkit and Templates
 
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdfMeas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
Meas_Dylan_DMBS_PB1_2024-05XX_Revised.pdf
 
Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...Discover the innovative and creative projects that highlight my journey throu...
Discover the innovative and creative projects that highlight my journey throu...
 
Premium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern BusinessesPremium MEAN Stack Development Solutions for Modern Businesses
Premium MEAN Stack Development Solutions for Modern Businesses
 
Building Your Employer Brand with Social Media
Building Your Employer Brand with Social MediaBuilding Your Employer Brand with Social Media
Building Your Employer Brand with Social Media
 
Buy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star ReviewsBuy Verified PayPal Account | Buy Google 5 Star Reviews
Buy Verified PayPal Account | Buy Google 5 Star Reviews
 
BeMetals Investor Presentation_June 1, 2024.pdf
BeMetals Investor Presentation_June 1, 2024.pdfBeMetals Investor Presentation_June 1, 2024.pdf
BeMetals Investor Presentation_June 1, 2024.pdf
 
The effects of customers service quality and online reviews on customer loyal...
The effects of customers service quality and online reviews on customer loyal...The effects of customers service quality and online reviews on customer loyal...
The effects of customers service quality and online reviews on customer loyal...
 
ikea_woodgreen_petscharity_dog-alogue_digital.pdf
ikea_woodgreen_petscharity_dog-alogue_digital.pdfikea_woodgreen_petscharity_dog-alogue_digital.pdf
ikea_woodgreen_petscharity_dog-alogue_digital.pdf
 
Call 8867766396 Satta Matka Dpboss Matka Guessing Satta batta Matka 420 Satta...
Call 8867766396 Satta Matka Dpboss Matka Guessing Satta batta Matka 420 Satta...Call 8867766396 Satta Matka Dpboss Matka Guessing Satta batta Matka 420 Satta...
Call 8867766396 Satta Matka Dpboss Matka Guessing Satta batta Matka 420 Satta...
 
Understanding User Needs and Satisfying Them
Understanding User Needs and Satisfying ThemUnderstanding User Needs and Satisfying Them
Understanding User Needs and Satisfying Them
 
Agency Managed Advisory Board As a Solution To Career Path Defining Business ...
Agency Managed Advisory Board As a Solution To Career Path Defining Business ...Agency Managed Advisory Board As a Solution To Career Path Defining Business ...
Agency Managed Advisory Board As a Solution To Career Path Defining Business ...
 
FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134FINAL PRESENTATION.pptx12143241324134134
FINAL PRESENTATION.pptx12143241324134134
 
Auditing study material for b.com final year students
Auditing study material for b.com final year  studentsAuditing study material for b.com final year  students
Auditing study material for b.com final year students
 

BIG DATA | How to explain it & how to use it for your career?

  • 1. BIG DATA | How to explain it & how to use it for your career?
  • 3. NetCom Learning – Managed Learning Services
  • 4.
  • 5. Today’s Agenda If you ask people what BIG DATA is, they often say it is about a lot of data. But the world has ALWAYS had a lot of data! It is about datafication – a word so new that even spellcheck functions don’t know it’s a real word! • How BIG DATA changes career paths of even the most unsuspecting! • How BIG DATA changes the way business decisions are made. • How BIG DATA changes who makes the decisions & reshuffles the balance of power. • What BIG DATA skills you can bring to the office tomorrow to increase your value.
  • 6. The experienced Data scientists & those managers who leverage them. BIG DATA is a management tool even if you have other employees perform the coding. BIG DATA is as ubiquitous as the internet. Gut instinct now of less value
  • 7. Datafication A modern technological trend turning many aspects of our life into computerized data that transforms respective information into new forms of value.
  • 10. A Metaphor / Illustration
  • 12. activity or purpose natural to or intended for a person or thing. relationship or expression involving one or more variables.
  • 14.
  • 15. Just as voice mail and email obviated the manager’s need of secretarial functions, algorithms eating BIG DATA are now obviating tactical managerial functions. Transactional Work Tactical Work
  • 16. Strategy needs to consume data. Data, without strategy, has little value.
  • 17. Modified sine wave Sine wave What is the difference between analogue and digital? Datafication is only possible due to digitalization of analogue information.
  • 19. Interprets continuous sine wave as a digital recreation.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28. This photo was taken on film – not a digital camera.
  • 29.
  • 30. Are there data points within this “single” data point? Social Construct
  • 33. Big data: broad term for data sets so large & complex that traditional data processing applications are inadequate.
  • 34. A terabyte, petabyte & gigabyte walk into a bar...
  • 36. A yottabyte is 1,000 trillion gigabytes. Kilo, Mega, Giga, Tera, Peta, Exa, Zetta, Yotta
  • 37. The Least You Need to Know About BIG DATA BIG DATA manifests 3 basic shifts: • From Small to All • Clean to Messy • Causation to Correlation V. Suppa The Definitive 90 Thousand Foot Lecture on BIG Data© 2014
  • 38. Scope of Traditional Data • Data growth analogous to y = tan x. • In 2000, ¼ of the world’s information was digital; the remainder was preserved in analog. • Digital data doubles around every 3 years. • In 2014, less than 2% of all stored information is analog. (And now we’re in 2017!)
  • 39. Big Data is Not About Lots of Data • Lots of data existed before Big Data! • Big Data: the ability to render aspects of life into data points never quantified before. • This is DATAFICATION … your new word of the day!
  • 40. DATAFICATION • Location was datafied before GPS was invented. • Words treated as data. • Friendships & likes datafied, via Facebook.
  • 41. • Shigeomi Koshimizu datafied body contour (body, posture, weight distribution, etc.). • Quantified “sitting down.” Measured pressure drivers exert at 360 different points via sensors (0 to 256 scale). Quality → Quantify. Datafication Turns Everything into a Data Point
  • 42. Tools of Datafication • inexpensive computers (commodity) • powerful processors (commodity) • basic statistics (commodity) • clever software (commodity) • smart algorithm (differentiator)
  • 43. Lots of Data versus BIG DATA. Computers computing lots of data: teaching a computer to translate by inputting bilingual dictionaries. Computers computing BIG DATA: feed the computer years of Canadian parliamentary transcripts (French / English) and statistically program it to infer which English word is the best alternative to the French. In context, the French word lumière is a more appropriate substitute for the English word light than léger. Isn’t this how a person translates?
  • 44. A Quick Review & then … Causation to Correlation • sampling population → entire population • pristine data → non-curated messy data • causation → correlation. Reasoning about how the world works is replaced with learning about associations among phenomena. • Knowing cause “is” desirable. • But cause is harder to figure out. • Cause as illusion? Cognitive bias
  • 45. Saving Trucks Saving Babies Saving Epidemics Saving Buildings
  • 46. Place sensors on parts to identify heat & vibrational patterns associated with failures leading to breakdowns. Can predict a breakdown before it happens & replace parts in the garage & not on the side of the road. • Data does not tell us why the part is in trouble • It reveals enough to know the what • Can guide investigations into discovering the underlying cause. Causation to Correlation
  • 47. • When saving lives, knowing something is likely to occur is more important than knowing why. • Eventually, “the why” will be investigated.
  • 48. Can Big Data Save Babies? Used Big Data to spot infections in premature babies before symptoms appear. • Information flow >1000 data points per second • Discovered correlations between very minor changes and more serious problems
  • 49. Big Data Predicts Epidemics Better than CDC. CDC tracks patient visits to clinics. Information suffers from a 2-week reporting lag. Google took the 50 million most commonly searched terms from 2003 – 2008 and compared them against historical influenza data from the CDC. Searches were then correlated with the CDC’s data on flu outbreaks.
  • 50. How All Three Shifts Are Illustrated. Small to All: Ran 100% of US searches for 6 years through an algorithm that identified 45 searches correlated with CDC data on flu outbreaks (runny nose, body aches, etc.). Clean to Messy: Searches were imperfect, with misspellings & incomplete phrases, & included healthy people searching on behalf of others. Causation to Correlation: Will anyone claim typing symptoms into a search engine gives you the flu? Big Data via searches predicts outbreaks in real time, compared to the CDC’s traditional data analytics, which lag by 2 weeks.
  • 51. Illegally subdivided buildings more likely to catch fire. 200 inspectors to respond to 25K complaints / year wrt overcrowded buildings.
  • 52. NYC created database of 900K buildings augmented by troves of data collected by 19 agencies: • Records of tax liens • Anomalies in utility usage • Service cuts • Missed payments • Ambulance visits • Local crime rates • Rodent complaints • Etc. Big Data increases the productivity of each inspector
  • 53. How Did They Do It? 1. Compared database (5 years of building fires) 2. Ranked by severity 3. Observed correlation. (Not causality!) 4. Data scientists triaged complaints for inspections. Concluded that a building’s: • type & age are the main predictors of fire; other variables superfluous • permit for exterior brickwork correlated with lower risk of fire. Result: Vacate orders increased from 13% to 70%. Building characteristics did not cause fire but were correlated with fire risk.
  • 54. Spending money on the exterior correlates with an up-to-code interior. But just the intent to begin work correlates enough to predict an outcome.
  • 55.
  • 56. 1. Pull disparate sets of texts & put them into a “point of singularity.” Currently about 70% of data is text. Pictures are to be quantified under separate protocols. 2. Create a Corpus → body of text to be analyzed. R, for example, has a set of functions to clean up a Corpus by excluding data points superfluous to analysis. (Delete commas, periods & words such as but & and, etc.) R cleans up files by reducing the corpus to primary words crucial to analysis. 3. Truncate words with a common stem → this is called stemming (e.g. engineer & engineering both become the same word). Think of the mathematical analogy of number factoring versus least common denominator.
  • 57. 4. A mathematical matrix that describes the frequency of terms that occur in a collection of documents. Rows correspond to documents in the collection & columns correspond to terms. Create a document-term matrix that measures the frequency of words that remain after the corpus “cleanup” discussed in the previous slide. You are left with primary outputs that enable you to do counts in each cell. You’ve datafied, or quantified, words that others only qualify, which prevents analysis. You can now do lots of interesting stuff! Cluster analysis of the document-term matrix reveals prevalent themes. Document-term matrix
  • 58. Cluster analysis → review how all your words cluster in your data matrix. The result of this analysis is that we can reduce our matrix to fewer columns. Font Size & even Color are embedded with information. This information is actionable.
  • 59. For centuries we have manually counted sets of words to determine their frequencies. Zipf's law states that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. Used for resumes as a way to increase information density – to be covered at a future webinar.
  • 60. • With these data sets, we can run sentiment analysis! • Determine occurrence rate of certain themes qualified as opinions. • To determine if people like a restaurant we’d look at words reviewers used via social media in the comment section. Love 10, Like 7, Dislike -7, Hate -10. Qualitatively, we quantify the weakness or strength of these signals. We determine words that correlate with having disliked or liked the movie, and to what degree, along a predetermined discrete continuum. Pre-established words in narrative responses, now embedded in clusters, signal positive or negative statements about a movie, restaurant or Hammacher Schlemmer customer review.
  • 61. The difference between analog and digital signals is that an analog signal is a continuous electrical message while digital is a series of values that represent information.
  • 62. To determine what traits can predict future outcomes, look at historical data. Correlate “judgements” to see if they can predict from groupings, meaning which ones predict against another dataset. This is cross validation and is determined by looking at historical data sets. Master algorithms script other algorithms on an as-needed basis, free of human interaction. Machine to machine (M2M) is technology that enables networked devices to exchange information & perform actions without the manual assistance of humans. This is what is replacing traditional managerial jobs. Firms that still employ these types of jobs feel less pressure to keep salaries at pace with inflation over time.
  • 63. Machine learning can test statistical models … for example, testing against known political party membership & updating the algorithm as new data comes in. In M2M, we let data points come in, refresh & update to automatically script an even more accurate algorithm. Can infer your political affiliation from your first 19 likes even if those likes are completely apolitical.
  • 64. What Can I Do Tomorrow Morning at the Office? Even if I hate coding and math! 1. Take inventory of the data you already collect. A. Internal data. B. External data accessed via the FOI Act – to be discussed subsequently. C. External data legally purchased from vendors (Yelp, FB, DoubleClick, etc.). D. Create a glossary of data definitions (headcount example). 2. Determine decisions to derive from Big Data. A. Select the most pressing problem based on the Pareto 80/20 rule. B. In plain English, state your problem statement. C. Write down independent variables (inventory the set of data at your disposal). D. Determine the dependent variable (preferred outcome to your problem statement). 3. Write down your hypothesis. 4. Contact your IT or data science department. If not ….. 5. Contract STEM grad students & turn them into data scientists. 6. Code your hypothesis. Quantitative Skills
  • 65. The Freedom of Information Act (FOIA), 5 U.S.C. § 552, is a federal law that allows for disclosure of previously unreleased information controlled by the US government. Enacted in 1966, it allows U.S. citizens to petition the government for official information. Correlate your own data with troves of external data from the US gov’t (example: MTA apps)!
  • 66. State the business problem you are trying to solve in plain language as a problem statement. State it as a hypothesis. Collect data from systems already in place. Test the hypothesis.
  • 67.
  • 68. Coding is the new literacy. Coding Classes. Most are on-line, a few on-site. Some free & some at cost. Most of you will not be competing with other coders – just other Marketing, HR or Financial professionals who know nothing about coding!
  • 69. Should I learn to read? Should I learn how to use the internet? Should I learn about coding?
  • 70.
  • 71.
  • 72. A little about R • R – Free • Contains embedded tools to pull external data • Tools that scrape data from any website (Reuters, as one example) • Text Mining: Knime (another software tool for text mining) – you can download it. (Pronounced like 9 but with an “m”. Has a graphical interface instead of using a scripting language.) • Remember, a Word Cloud is an example of text mining. • R was written in the C language – coders wrote functions in “C” to create macros in R to pull data - analogous to a macro in Excel. • R will let you pull data into a corpus. KNIME - Konstanz Information Miner → open source data analytics, reporting & integration platform. It integrates various components for machine learning & data mining.
  • 73. You’re not competing against other coders. You’re competing against others in your field that know nothing about coding.
  • 74. Facebook accomplished what democratic gov’t tried but failed to do – build a database of citizens.
  • 75. Datafication takes all aspects of life & turns them into data. Google’s augmented reality glasses datafy the gaze. Twitter datafies stray thoughts. LinkedIn datafies professional networks.
  • 76. The Floor as a Giant iPad via surface-based computing technology. “No thank you, I’m just looking.” “That’s okay, I’m datafying your every move.” Touch-sensitive floor customers walk on
  • 77. Download a Coupon Texted to You • What aisles did you walk down or ignore? • In what sequence did you browse the aisles? • How long were you in the store? • What is the length of time between store visits? • How long did you linger in front of the cereal aisle? • When you checked out, did cereal wind up in your cart? How many boxes? Compare viewing patterns with what wound up in your shopping cart. Script algorithms to better predict how independent variables (what they stock) relate to the dependent variable of revenue thresholds.
  • 78. So, what’s my role again in a Big Data World? As Big Data becomes ubiquitous, what skills mark points of differentiation? • Discovering latent needs & intuition that goes against the facts? • The mere ability to define a problem precedes its solution. Big Data has a quantitative & a qualitative side. And if you hate math - qualitative skills to harness: • Develop observational skills to separate signal from the noise • Take inventory of existing data • Learn to develop hypotheses to test • Learn how to access external data (FOIA, LinkedIn, etc.) • Liaise between internal ERP data & external data • Network with STEM students to contract data scientists
  • 79. Your Role in a Big Data World. If Ford had queried BIG DATA to discover what customers want, he’d have come up with faster horses that required less water. In a Big Data world, traits to be developed: • Creativity • Intuition • Intellectual curiosity • Leveraging errors • Risk taking. Read outside your discipline
  • 80.
  • 81. non sequitur Vincent Suppa © 2016 Don’t be afraid to fail. Business is not figure skating. It’s the X games! If you fail, have quick-to-market failures that mitigate loss & allow you to harvest what did work for the next initiative. Capitalism without failure is like a religion without sin.
  • 82. Recommended Courses NetCom Learning offers a comprehensive portfolio of Big Data training options. Please see below the list of recommended courses with upcoming schedules: • Introduction to Python Programming • Essential Python • Introduction to Python Scripting: for the Security Analyst. Check out more Big Data training options with NetCom Learning.
  • 83. Our live webinars will help you to touch base on a wide variety of IT, soft skills and business productivity topics, and keep you up to date on the latest IT industry trends. Register now for our upcoming webinars: • A Brief on Benefits of ITIL for the Organization – April 4 • Visualization with Tableau to Enhance Efficiency in Organization – April 6 • How Machine Learning Helps Organizations to Work More Efficiently? – April 11 • Why Certified Associate in Project Management (CAPM) and How to Prepare? - April 18 • A Brief About DevOps and its Practices – April 20
  • 84. Special Promotion Whether you're learning new IT or Business skills, or you are developing a learning plan for your team, for a limited time, register for our Guarantee to Run classes and get 25% off the course price.
  • 85. To get latest technology updates, please follow our social media pages!
  • 86.
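
A minimal Python sketch of the text-mining steps on slides 56 to 60: clean the corpus, stem, build a document-term matrix, then score sentiment. The slides name R; Python is swapped in here purely for illustration. The stop-word list, the three sample reviews and the crude suffix-stripping `stem` function are assumptions invented for this example; only the four sentiment scores come from slide 60.

```python
import re
from collections import Counter

# Hypothetical stop words (slide 56: drop words such as "but" and "and").
STOP_WORDS = {"a", "an", "and", "but", "i", "is", "it", "the", "to", "was"}

# Scores as given on slide 60.
SENTIMENT = {"love": 10, "like": 7, "dislike": -7, "hate": -10}

# Invented sample reviews, for illustration only.
REVIEWS = [
    "I love the engineering, and I love the service.",
    "The engineer was rude, but I like the food.",
    "I hate the noise and dislike the prices.",
]


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def clean(text):
    """Lowercase, drop punctuation and stop words, then stem what remains."""
    words = re.findall(r"[a-z]+", text.lower())
    return [stem(w) for w in words if w not in STOP_WORDS]


def document_term_matrix(docs):
    """Return (terms, rows): one row per document, one column per term."""
    counts = [Counter(clean(doc)) for doc in docs]
    terms = sorted(set().union(*counts))
    rows = [[c[t] for t in terms] for c in counts]
    return terms, rows


def sentiment_score(text):
    """Sum the lexicon scores of every scored word in the text."""
    return sum(SENTIMENT.get(w, 0) for w in clean(text))


if __name__ == "__main__":
    terms, rows = document_term_matrix(REVIEWS)
    print(terms)
    for review, row in zip(REVIEWS, rows):
        print(row, sentiment_score(review), review)
```

A real project would use a proper stemmer and a much larger lexicon; the point is only that words become counts you can analyze.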

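Slides 49, 50 and 66 describe correlating search terms with CDC flu counts and testing a hypothesis against data already collected. A rough Python sketch of that ranking step follows. All the numbers and the "pizza delivery" term are made up for illustration, and Pearson's r is an assumed choice of measure; the slides do not say which statistic was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures: the dependent variable (reported flu cases)
# and candidate independent variables (search counts per term).
FLU_CASES = [10, 20, 30, 40, 50]
SEARCHES = {
    "runny nose": [12, 22, 29, 41, 52],
    "body aches": [5, 9, 16, 19, 26],
    "pizza delivery": [30, 10, 40, 20, 30],
}


def rank_terms(searches, outcome):
    """Sort search terms by how strongly they correlate with the outcome."""
    scored = [(term, pearson(counts, outcome)) for term, counts in searches.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for term, r in rank_terms(SEARCHES, FLU_CASES):
        print(f"{term}: r = {r:.2f}")
```

A high r only says the two series move together; as slide 50 puts it, nobody claims that typing symptoms into a search engine gives you the flu.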
Editor's Notes

  1. Words are now treated as data when computers mine centuries’ worth of books. Even friendship and “likes” are datafied, via Facebook.
  2. It can infer the probability that a traffic light is green and not red “or”
  3. 65 slides as of October 24 2016 -