Is Big Data Good or Evil?
Arthur Charpentier and Ewen Gallic
Eco Club
Institut Franco-Américain
Rennes, January 2017
● Arthur Charpentier ( @freakonometrics)
○ Assistant Professor, University of Rennes 1
○ Ph. D. in Applied Mathematics, KU Leuven
○ Fellow of the French Institute of Actuaries
○ MSc in Mathematics Applied to Economics, Paris Dauphine University
○ MSc in Statistics, ENSAE, Paris
● Ewen Gallic ( @3wen)
○ Ph. D. Student in Economics, University of Rennes 1
○ MSc in Econometrics, University of Rennes 1
What is Big Data?
Data from our activity
Every day, data are collected from our activities, such as:
● Paying for our shopping at the grocery store, at the pharmacy, …
● Doing a Google search
● Listening to some music on Spotify
● Sending an e-mail
● Posting an update on a social network
Data Created/Shared by us
Different kinds of data are collected:
● Texts from conversations on our phones or on Facebook, Twitter, …
● Photos posted on Instagram, Facebook, Tumblr, Pinterest, …
● Videos uploaded on Snapchat, YouTube, …
● Data footprint
● …
Data Collected by Sensors
Sensors record our activity:
● GPS tracking on our phones
● Activity trackers, sleep trackers (e.g. Fitbit watch)
● In-car tracking devices (e.g. YouDrive)
● Climate data from weather stations
● …
Source: Michael Wallace
Source: Aaron Parecki, via FlowingData
The 4 Vs of Big Data
Four specific attributes define Big Data (the four Vs):
● Volume
● Velocity
● Variety
● Veracity
Source: IBM Big Data & Analytics Hub
= 2.5 × 10^18 bytes = 2.5 billion gigabytes
≈ 357,142,857 three-hour HD movies on Netflix
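The volume figures above can be cross-checked with a few lines of arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes); the roughly 7 GB size of a three-hour HD movie is not stated on the slide but is inferred from the movie count, and all names are illustrative.

```python
# Cross-check of the volume figures quoted on the slide.
# Assumption: decimal units, 1 GB = 10**9 bytes.
TOTAL_BYTES = 25 * 10**17          # 2.5 x 10^18 bytes (from the slide)
BYTES_PER_GB = 10**9
MOVIES_ON_SLIDE = 357_142_857      # movie count quoted on the slide


def bytes_to_gigabytes(n_bytes: int) -> int:
    """Convert bytes to whole decimal gigabytes."""
    return n_bytes // BYTES_PER_GB


def implied_movie_size_gb(total_gb: int, n_movies: int) -> float:
    """Size per movie (in GB) that makes the slide's movie count work out."""
    return total_gb / n_movies


total_gb = bytes_to_gigabytes(TOTAL_BYTES)
movie_gb = implied_movie_size_gb(total_gb, MOVIES_ON_SLIDE)

print(f"Total volume: {total_gb:,} GB")
print(f"Implied size of one three-hour HD movie: {movie_gb:.2f} GB")
```

Under these assumptions, the movie count on the slide corresponds to about 7 GB per three-hour movie; a different size assumption would change the count proportionally.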
Source: IBM Big Data & Analytics Hub
~15% of 2016 US GDP
Source: IBM Big Data & Analytics Hub
● Structured data, e.g.:
○ Time
○ Date
○ Value
○ …
● Unstructured data, e.g.:
○ Video
○ Podcast
○ Social Media Status
○ …
The Angel…
The Angel: Healthcare
● Data-driven analysis to predict a disease’s geographical spread
○ Ebola in 2014
● Google Flu Trends
○ Aims to predict flu
○ Big failure in 2013
● Creation of Electronic Health Records
○ Better diagnostics
○ Reduced costs
Map of Ebola cases in West Africa from January 2014 to December 2015. Source: World Health Organization
Divergence of Google Flu Trends. Source: How accurate is Google Flu Trends? Keith Winstein (2013)
The Angel: Insurance
● Telemetry
○ Fairer premiums
○ Better knowledge of driving habits
○ Drop in number of accidents
Source: TIA Technology
● Detection of fraudulent claims
○ In France, the insurance association (FFA) estimates that fraudulent claims amount to 5% of claims
Fraudulent claims: €2.5 billion in 2015
Source: FFA via L'argus de l'Assurance
The Angel: Jobs
● Résumé processing
○ Faster
○ Reduce discrimination?
○ Help to find suitable candidates for a position
● Adjusting schedules
○ During peak hours and off-peak hours
72% of résumés are never seen by human eyes
Source: Cathy O’Neil, “Weapons of Math Destruction” (2016), Crown
The Angel: Sports
● Forecasting player performance
○ Nate Silver’s PECOTA
● Providing guidance
○ Kirk Goldsberry
Nate Silver, the developer of PECOTA, Editor-in-chief of FiveThirtyEight
Source: BallR by Todd W. Schneider (reproduction of Goldsberry’s chart)
… and the Demon
The Demon: Healthcare
● Growing number of sensors to monitor our “wellness” (e.g. Apple Watch)
○ Intrusive
○ Insurers get in the way
● What about the security of standardized medical data?
○ What happens if your future employer gets their hands on this kind of data?
The Demon: Insurance
● Moving towards the individual
○ Individual pricing
○ Opaque
○ Blind to inequalities
The Demon: Jobs
● Optimized schedules
○ Just-in-time economy applied to human beings
○ Work to live, or live to work?
○ Hits people in desperate need of money
● Résumé processing
○ Asymmetric information
○ No feedback on rejected candidates
● Algorithms designed to fire people
○ Opaque procedures
○ Unfair
Recap
● Data are created or shared by us, or even recorded by sensors
● A huge amount is created every day, and comes in different forms (text, video, …)
● It may help to increase welfare, or to understand patterns around us…
● But it also contributes to an increase in inequalities
To go further...
Thank you for your attention
Arthur Charpentier
@freakonometrics
Ewen Gallic
@3wen
