Big Data &
Data Science
W's
Emanuele Della Valle
@manudellavalle
Prof. @polimi & Founder @fluxedo_
W's
18/06/2018 @manudellavalle - http://emanueledellavalle.org 2
Why?
• In many organizations decisions are made by
"questionable" methodologies such as
– Highest Paid Person Opinion (HiPPO)
– Flipism (all decisions are made by flipping a coin)
• This could have been the right approach in the '70s …
– See the "Theory of Bounded Rationality" by Herbert Simon
[source: http://www.azquotes.com/quote/139996]
• … but in the Big Data era one can dream of a
data-driven organization
• Data-Driven Organization
Decisions no longer have to be made in the dark
or based on gut instinct; they can be based on
evidence, experiments and more accurate
forecasts.
-- McKinsey
• Data-driven organizations
– perform better
• The data shows where they can streamline their processes
– are operationally more predictable
• Data insights fuel current and future decision making
– are more profitable
• Constant improvements and better predictions help to
outsmart the competition and improve innovation.
• Moneyball: data + analysis to win games
[source: https://www.imdb.com/title/tt1210166/]
What's Big Data?
[source: IBM, 2012]
What's Big Data?
• Big Data is "crude oil" … that we have to
– Extract
– Transport in mega-tankers
– Ship through pipelines
– Store in massive silos
– …
What's Data Science?
• Data Science is "refining crude oil"
[source: http://allabtinstru.blogspot.com/2016/09/ProcessofRefiningCrudeOil.html]
• The Science [and Art] of…
– Discovering what we don’t know from data
– Obtaining predictive, actionable insight from data
– Creating Data Products that have business impact
now
– Communicating relevant business stories from data
– Building confidence in decisions that drive business
value
Who's a Data Scientist?
• Drew Conway, 2010
How?
• Statistics starts with data
• Two goals of analyzing data
– Descriptions: how nature associates responses to inputs
– Predictions: response for future input variables
[source: Statistical Modeling: The Two Cultures. Leo Breiman, 2001]
[Figure: x (independent variable) → nature → y (response variable)]
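The two goals above can be made concrete with a small sketch. It assumes a hypothetical "nature" that is linear with a little noise, and it swaps in an ordinary least-squares line as the model; neither choice comes from the slides. The fitted coefficients serve the description goal, and evaluating the line at an unseen input serves the prediction goal.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Response the fitted line gives for a new input x."""
    slope, intercept = model
    return slope * x + intercept


def nature(x, rng):
    """Hypothetical black box: the analyst only ever sees its (x, y) pairs."""
    return 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)


rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [nature(x, rng) for x in xs]

model = fit_line(xs, ys)

# Description: how the response is associated with the input.
print("slope, intercept:", model)

# Prediction: the response for an input that was never observed.
print("predicted y at x = 12:", predict(model, 12.0))
```

The same fitted model answers both questions here only because the assumed nature is simple; in general a model can predict well without describing nature faithfully.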
Leverage more of the data being captured
[source: Marc Andrews, 2014]
Reduce effort required to leverage data
[source: Marc Andrews, 2014]
Data-driven exploration looking for correlation
[source: Marc Andrews, 2014]
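A minimal sketch of what looking for correlation can mean in practice: compute the Pearson coefficient between each candidate column and a target, then rank the columns by its absolute value. The column names and figures are invented for illustration, and a high coefficient only flags a relationship worth investigating; it does not establish a cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly figures, invented for illustration only.
columns = {
    "ad_spend": [10, 12, 15, 18, 20, 25],
    "visits": [110, 118, 160, 175, 210, 240],
    "returns": [9, 7, 8, 9, 7, 8],
}
sales = [52, 60, 75, 88, 101, 120]  # hypothetical target column

# Rank candidate columns by how strongly they correlate with the target.
ranked = sorted(
    columns,
    key=lambda name: abs(pearson(columns[name], sales)),
    reverse=True,
)
for name in ranked:
    print(name, round(pearson(columns[name], sales), 3))
```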
Your butcher …
… at scale!
Leverage data as it is captured
[source: Marc Andrews, 2014]
[source: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/]
Overall picture by Gartner
Where?
Improve public safety and reduce violent crime through data analytics
-41% murders | -27% crimes
[source: https://www.ted.com/talks/anne_milgram_why_smart_statistics_are_the_key_to_fighting_crime]
What about cybersec?
Credits
• Big Data [sorry] & Data Science: What Does a Data Scientist Do? Carlos Somohano, 2013
– https://www.slideshare.net/datasciencelondon/big-data-sorry-data-science-what-does-a-data-scientist-do-world
• Becoming a data-driven organization: The what, why and how. SAS, 2018
– https://www.sas.com/en_us/whitepapers/becoming-data-driven-organization-109150.html
• Never trust summary statistics alone; always visualize your data. Alberto Cairo, 2016
– http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html
• 2017 Planning Guide for Data and Analytics. John Hagerty (Gartner), 2016
– https://www.gartner.com/binaries/content/assets/events/keywords/catalyst/catus8/2017_planning_guide_for_data_analytics.pdf
Thank you!
Any questions?
Emanuele Della Valle
@manudellavalle
Prof. @polimi & Founder @fluxedo_

Big Data and Data Science W's

  • 1. Big Data & Data Science W's Emanuele Della Valle @manudellavalle Prof. @polimi & Founder @fluxedo_
  • 2. W's 18/06/2018 @manudellavalle - http://emanueledellavalle.org 2
  • 3. Why?
      • In many organizations decisions are made by "questionable" methodologies such as
          – Highest Paid Person Opinion (HiPPO)
          – Flipism (all decisions are made by flipping a coin)
  • 4. Why? Highest Paid Person Opinion (HiPPO)
  • 5. Why? Flipism (all decisions are made by flipping a coin)
  • 6. Why?
      • In many organizations decisions are made by "questionable" methodologies such as
          – Highest Paid Person Opinion (HiPPO)
          – Flipism (all decisions are made by flipping a coin)
      • This could have been the right approach in the '70s …
          – See the "Theory of Bounded Rationality" by Herbert Simon
  • 7. Why? [source: http://www.azquotes.com/quote/139996]
  • 8. Why?
      • In many organizations decisions are made by "questionable" methodologies such as
          – Highest Paid Person Opinion (HiPPO)
          – Flipism (all decisions are made by flipping a coin)
      • This could have been the right approach in the '70s …
          – See the "Theory of Bounded Rationality" by Herbert Simon
      • … but in the Big Data era one can dream of a data-driven organization
  • 9. Why? • Data-Driven Organization
  • 10. Why? Decisions no longer have to be made in the dark or based on gut instinct; they can be based on evidence, experiments and more accurate forecasts. -- McKinsey
  • 11. Why?
      • Data-driven organizations
          – perform better
              • The data shows where they can streamline their processes
          – are operationally more predictable
              • Data insights fuel current and future decision making
          – are more profitable
              • Constant improvements and better predictions help to outsmart the competition and improve innovation.
  • 12. Why? • Moneyball: data + analysis to win games [source: https://www.imdb.com/title/tt1210166/]
  • 13. What's Big Data? [source: IBM, 2012]
  • 14. What's Big Data? [source: IBM, 2012]
  • 15. What's Big Data? [source: IBM, 2012]
  • 16. What's Big Data? [source: IBM, 2012]
  • 17. What's Big Data? [source: IBM, 2012]
  • 18. What's Big Data?
      • Big Data is "crude oil" … that we have to
          – Extract
          – Transport in mega-tankers
          – Ship through pipelines
          – Store in massive silos
          – …
  • 19. What's Data Science? • Data Science is "refining crude oil" [source: http://allabtinstru.blogspot.com/2016/09/ProcessofRefiningCrudeOil.html]
  • 20. What's Data Science?
      • The Science [and Art] of…
          – Discovering what we don’t know from data
          – Obtaining predictive, actionable insight from data
          – Creating Data Products that have business impact now
          – Communicating relevant business stories from data
          – Building confidence in decisions that drive business value
  • 21. Who's a Data Scientist? • Drew Conway, 2010
  • 22. How?
      • Statistics starts with data
      • Two goals of analyzing data
          – Descriptions: how nature associates responses to inputs
          – Predictions: response for future input variables
      [figure: x (independent variable) → nature → y (response variable)]
      [source: Statistical Modeling: The Two Cultures. Leo Breiman, 2001]
  • 23. How? Leverage more of the data being captured [source: Marc Andrews, 2014]
  • 24. How? Leverage more of the data being captured [source: Marc Andrews, 2014]
  • 25. How? Leverage more of the data being captured [source: Marc Andrews, 2014]
  • 26. How? Reduce effort required to leverage data [source: Marc Andrews, 2014]
  • 27. How? Reduce effort required to leverage data [source: Marc Andrews, 2014]
  • 28. What? Reduce effort required to leverage data [source: Marc Andrews, 2014]
  • 29. How? Data-driven exploration looking for correlation [source: Marc Andrews, 2014]
  • 30. How? Data-driven exploration looking for correlation [source: Marc Andrews, 2014]
  • 31. Your butcher …
  • 32. … at scale!
  • 33. How? Leverage data as it is captured [source: Marc Andrews, 2014]
  • 34. How? Leverage data as it is captured [source: Marc Andrews, 2014]
  • 35. How? Leverage data as it is captured [source: Marc Andrews, 2014]
  • 36. How? [source: https://docs.microsoft.com/en-us/azure/machine-learning/team-data-science-process/]
  • 37. How? Overall picture by Gartner
  • 38. Where? Improve public safety and reduce violent crime through data analytics: -41% murders | -27% crimes [source: https://www.ted.com/talks/anne_milgram_why_smart_statistics_are_the_key_to_fighting_crime]
  • 39. Where?
  • 40. Where?
  • 41. What about cybersec?
  • 42. Credits
      • Big Data [sorry] & Data Science: What Does a Data Scientist Do? Carlos Somohano, 2013
          – https://www.slideshare.net/datasciencelondon/big-data-sorry-data-science-what-does-a-data-scientist-do-world
      • Becoming a data-driven organization: The what, why and how. SAS, 2018
          – https://www.sas.com/en_us/whitepapers/becoming-data-driven-organization-109150.html
      • Never trust summary statistics alone; always visualize your data. Alberto Cairo, 2016
          – http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html
      • 2017 Planning Guide for Data and Analytics. John Hagerty (Gartner), 2016
          – https://www.gartner.com/binaries/content/assets/events/keywords/catalyst/catus8/2017_planning_guide_for_data_analytics.pdf
  • 43. Thank you! Any questions? Emanuele Della Valle @manudellavalle Prof. @polimi & Founder @fluxedo_