Setting the Scene for Big Data in Europe, Looking Ahead to the Case Studies
1. BYTE: Big data roadmap and cross-disciplinary community for addressing societal Externalities
Setting the scene for Big Data in Europe, looking ahead to the case studies
Guillermo Vega-Gorgojo – Universitetet i Oslo
2. So far, what have we learned in BYTE?
◦ Big data, more than “the 3Vs”
◦ Definition, dimensions, activities, applications, data flows, policies
◦ Big data initiatives
◦ Technologies and infrastructures for big data
◦ Positive and negative societal externalities
◦ Economic, legal, social, ethical, political…
@BYTE_EU www.byte-project.eu
3. What do we expect to learn through the case studies?
1. Investigate which positive and negative societal externalities organizations create through the use of big data
2. How they have worked to amplify positive externalities
3. How they have addressed the negative externalities they have encountered
4. A template for the case studies in BYTE
CASE STUDY OVERVIEW
1. Organization
2. Sector
3. Case study motto
4. Executive summary
5. Business processes
6. Relation to big data initiatives
7. Illustrative user stories
SOURCES OF INFORMATION
◦ Semi-structured interviews
◦ Organization documents
TECHNICAL PERSPECTIVE
8. Data sources
9. Data flows
10. Relevant big data policies
11. Main technical challenges
12. Big data dimensions
SOCIETAL EXTERNALITIES
13. Positive societal externalities
14. Negative societal externalities
15. Amplifying positive externalities
16. Addressing negative externalities
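One way to work with this template is to treat it as a record with one field per numbered item. The sketch below is illustrative only: the class name `CaseStudy`, the helper `missing_items` and all field names are hypothetical, chosen to mirror the sixteen items of the template, and are not part of any BYTE tooling.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class CaseStudy:
    """Hypothetical record with one field per template item (names are assumptions)."""
    # Case study overview (items 1-7)
    organization: str = ""
    sector: str = ""
    motto: str = ""
    executive_summary: str = ""
    business_processes: List[str] = field(default_factory=list)
    big_data_initiatives: List[str] = field(default_factory=list)
    user_stories: List[str] = field(default_factory=list)
    # Technical perspective (items 8-12)
    data_sources: List[str] = field(default_factory=list)
    data_flows: List[str] = field(default_factory=list)
    big_data_policies: List[str] = field(default_factory=list)
    technical_challenges: List[str] = field(default_factory=list)
    big_data_dimensions: Dict[str, bool] = field(default_factory=dict)
    # Societal externalities (items 13-16)
    positive_externalities: List[str] = field(default_factory=list)
    negative_externalities: List[str] = field(default_factory=list)
    amplifying_positive: List[str] = field(default_factory=list)
    addressing_negative: List[str] = field(default_factory=list)


def missing_items(study: CaseStudy) -> List[str]:
    """Return the names of template items that are still empty."""
    return [f.name for f in fields(study) if not getattr(study, f.name)]


# Usage: a draft with only the first two items filled in.
draft = CaseStudy(organization="Statoil", sector="Energy")
print(missing_items(draft))
```

Keeping the three groups of the template as comment-separated blocks in a single record makes it easy to check, per case study, which items still need input from interviews or organization documents.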
5. A model for the societal externalities
[Diagram of three actors: Citizens, Public Sector, Private Sector]
6. Examples of positive and negative societal externalities
[Diagram linking Citizens, Public Sector and Private Sector, annotated with the following examples]
+ support communities
- continuous and invisible surveillance
+ commercialization of new goods and services
+ data-driven employment offerings
+ innovative business models
- inequalities to data access
- need to reconcile different laws and agreements
+ economic growth through community building
- competitive disadvantage of newer businesses and SMEs
- private data misuse
- invasive use of information
+ accelerate scientific progress
+ transparency and accountability
- distrust of government data-based activities
7. The case studies
Case study | Organization | Contact partner
Environment | ESA and others | CNR
Crime | XXX | TRI
Smart cities | Siemens | Siemens
Culture | Europeana | TRI
Energy | Statoil | UiO
Health | Institute of Child Health | TRI
Transport | Rolls Royce/Farstad shipping | DNV
8. Preliminary case study analysis for Statoil
Case study overview
1. Organization: Statoil
2. Sector: ENERGY
3. Case study motto: Improve decision making in oil & gas exploration in the presence of partial information and limited time.
5. Business processes: Oil & gas exploration decision-making
6. Relation to big data initiatives: Research projects: OPTIQUE
4. Executive summary
In the early phases of the oil and gas exploration process, many prospects, i.e. potential projects, are under evaluation at any time in order to select just a few of them for further investigation. These decisions are often of critical importance for Statoil. However, in most cases prospects have to be selected at short notice and on the basis of only partial information. Typically, exploration experts in these very early phases of an exploration project spend just a few days collecting relevant information before they embark on further analyses; the data that is not found within this time frame is then simply ignored, and will hence not influence the important selection of prospects. If the geophysics and geology (G&G) experts utilize all the data available, this will reduce the risk factor in the selection process, and hence also increase the chances that the ‘right’ prospects are selected. In the end this will in all likelihood increase the number of successful exploration projects for Statoil.
9. Preliminary case study analysis for Statoil
Technical description
8. Data sources
Name: Subsurface
Short description:
◦ Seismic survey
◦ Seismic & geophysical data
◦ Well and wellbore data
◦ Acquisition reports
Domain: geophysics and geology
How it is collected:
◦ Seismic shots
◦ Well data from drilling operations
◦ Reports from value-adding analysis
Size: ~8 PB
…
11. Main technical challenges
Data storage and access: VERY CHALLENGING
◦ G&G experts in exploration spend 16% of their time on finding the relevant data sets and documents (internal survey of Statoil in 2005)
◦ There is a plethora of tools to access and process the different kinds of data, amplified by the segregation into silos
Data integration: CHALLENGING
◦ There is a clear need to integrate the data scattered across different repositories and databases from multiple vendors. For instance, the provided user story reflects that the Subsurface database was not up to date due to limited integration with the OpenWorks project databases
…
12. Big data dimensions
Volume: YES
◦ Some datasets are at a scale of PBs
◦ Extremely complex queries that can involve more than 30 joins
Velocity: NO
◦ No streaming data processing
Variety: YES
◦ Need for different data models to reflect the views of Drilling Engineers, Petrophysicists, Geophysicists, Geologists and Reservoir Engineers
◦ Very complex data models: ~K of tables and ~10K columns
Veracity: YES
◦ Some of the employed data sources are more trustworthy than others
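The four-dimension assessment can be captured as a small lookup table. This is a minimal sketch: the dictionary layout and the helper name `applicable_dimensions` are assumptions made here for illustration, while the YES/NO values are those given for the Statoil case.

```python
# Assessment of the Statoil case along the four big data dimensions.
# Values are copied from the slide (True = YES, False = NO); the layout
# itself is an illustrative assumption.
STATOIL_DIMENSIONS = {
    "volume": True,     # some datasets at a scale of PBs
    "velocity": False,  # no streaming data processing
    "variety": True,    # different data models for different experts
    "veracity": True,   # some sources more trustworthy than others
}


def applicable_dimensions(assessment):
    """Return the dimensions marked YES, in the order they were listed."""
    return [name for name, applies in assessment.items() if applies]


print(applicable_dimensions(STATOIL_DIMENSIONS))
# → ['volume', 'variety', 'veracity']
```

Recording the assessment in the same shape for every case study would make the dimensions directly comparable across sectors.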
10. Preliminary case study analysis for Statoil
Societal externalities
Statoil – Citizens
+ Reduced risk for environment
+ Demand for hiring big data analysts
Statoil – Other corporations
+ New work processes and vendor ecosystems
- Data lock-in, contracts prohibit access to data for third parties
- Increased risk of exposing confidential data
Statoil – Public sector
+ Better informed decisions for drilling operations based on open government data (FactPages)
- Competitive advantage of the private sector w.r.t. open data (Statoil doesn’t have to open their data, while it has access to public data)
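These externalities follow the pattern of the model: a signed effect attached to a pair of actors. A minimal sketch of that structure follows, assuming a hypothetical `Externality` record and a hypothetical `by_relation` helper (neither comes from BYTE); the entries are a subset of the Statoil list.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple


class Externality(NamedTuple):
    """Hypothetical record: a signed effect between two actors."""
    relation: Tuple[str, str]  # the pair of actors, e.g. ("Statoil", "Citizens")
    positive: bool             # True for "+", False for "-"
    description: str


# A subset of the Statoil externalities, for illustration.
STATOIL = [
    Externality(("Statoil", "Citizens"), True,
                "Reduced risk for environment"),
    Externality(("Statoil", "Citizens"), True,
                "Demand for hiring big data analysts"),
    Externality(("Statoil", "Other corporations"), False,
                "Increased risk of exposing confidential data"),
    Externality(("Statoil", "Public sector"), True,
                "Better informed decisions for drilling operations"),
]


def by_relation(items):
    """Count positive and negative externalities for each pair of actors."""
    counts = defaultdict(lambda: {"+": 0, "-": 0})
    for item in items:
        counts[item.relation]["+" if item.positive else "-"] += 1
    return dict(counts)


for relation, tally in by_relation(STATOIL).items():
    print(relation, tally)
```

Grouping by actor pair mirrors how the later summary slides are organized (Public sector – Citizens, Private sector – Citizens, and so on).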
11. Societal externalities (1-3)
Public sector – Citizens
+ Gather public insight by identifying social trends and statistics
+ Accelerate scientific progress
+ Tracking environmental challenges
+ Transparency and accountability of the public sector
+ Increased citizen participation
+ Foster innovation, e.g. new applications, from government data
+ Better services, e.g. health care and education, through data sharing and analysis
+ More targeted services for citizens, through profiling populations
+ Cost-effectiveness of services
+ Crime prevention and detection, including fraud
- Distrust of government data-based activities
- Unnecessary surveillance
- Compromise to government security and privacy
- Private data misuse, especially sharing with third parties without consent
- Threats to data protection and personal privacy
- Threats to intellectual property rights (including scholars' rights and contributions)
- Public reluctance to provide information (especially personal data)
12. Societal externalities (2-3)
Private sector – Citizens
+ Rapid commercialization of new goods and services
+ Free use of services, e.g. email, search engines
+ Enhancements in data-driven R&D
+ Making society energy efficient
+ Optimization of utilities through data analytics
+ Data-driven employment offerings
+ Marketing improvement
+ Increased insight into goods (more transparency)
+ Increased transparency in commercial decision making
+ Fostering innovation from opening data
+ Increased awareness about privacy violations and ethical issues of big data
+ Time-saving in transactions if personal data were already held
- Employment losses for certain job categories
- Invasive use of information
- Risk of informational rent-seeking
- Discriminatory practices and targeted advertising
- Distrust of commercial data-based activities
- Unethical exploitation of data
- Reduced market competition
- Consumer manipulation
- Creation of data-based monopolies (platforms and services)
- Private data accumulation and ownership
- Private data leakage
- Private data misuse, especially sharing with third parties without consent
- Privacy threats even with anonymized data and with data mining
- Threats to intellectual property rights
- Public reluctance to provide information (especially personal data)
- “Sabotaged” data practices
13. Societal externalities (3-3)
Citizens – Citizens
+ Support communities
- Continuous and invisible surveillance
Private sector – Private sector
+ Opportunities for economic growth
+ Innovative business models
- Barriers to market entry
- Inequalities to data access
- Market manipulation
- Challenge of traditional non-digital services
- Dependency on external data sources, platforms and services
- Competitive disadvantage of newer businesses and SMEs
- Reduced growth and profit among all business
- Threats to commercially valuable information
Public sector – Private sector
+ Opportunities for economic growth
+ Innovative business models
+ Support communities
- Open data puts the private sector at a competitive advantage
- Inequalities to data access, especially in research
- Taxation leakages
- Lack of norms for data storage and processing
Public sector – Public sector
- Geopolitical tensions due to surveillance out of the boundaries of states
- Need to reconcile different laws and agreements, e.g. "right to be forgotten"
Barriers to market entry