The impact of open data in the world and in Venezuela. Professor Maria Esther Vidal, Universidad Simón Bolívar. Presentation given during the boot camp on data journalism in Venezuela.
Presentation by Prof. Maria Esther Vidal. DataBootCampVE, 31 October 2013
1. Impact of Open Data and Linked Open Data
Venezuela
Maria-Esther Vidal
Universidad Simón Bolívar
mvidal@ldc.usb.ve
http://www.ldc.usb.ve/~mvidal
Twitter: @Maria11576561
Skype: mevs2006
2. Lights around London's 2012 Olympic stadium describe Sir Tim Berners-Lee's invention, the World Wide Web.
The Open Data Institute, which he co-founded, declares a mandate of 'Knowledge for Everyone'.
3. “The ODI announced 13 new nodes: US, Canada, France, Dubai, Italy, Russia, Sweden and Argentina.” Oct 29 2013
Sir Tim Berners-Lee (right) and Sir Nigel Shadbolt (left)
4. Agenda
• Open Data
• Linked Open Data
  – Linked Open Data in Journalism
• Linked Open Data Applications
  – Linked Open Data at USB
• Conclusions and Future Directions
6. Open Data
Definition http://opendefinition.org/:
“A piece of data or content is open if anyone is
free to use, reuse, and redistribute it — subject
only, at most, to the requirement to attribute
and/or share-alike.”
Availability and access
Reuse and Distribution
Universal Participation
7. Open Data
Availability and Access:
Data should be available as a whole, preferably downloadable via the Internet.
Data should be available in a convenient format.
Data should be free of charge, or cost at most a reasonable reproduction fee.
8. Open Data
Reuse and Distribution:
Data should be offered in a way that allows it to be reused, redistributed and combined with other datasets.
9. Open Data
Universal Participation:
Any person should be able to use, reuse and redistribute the data.
No discrimination:
Commercial vs. non-commercial
Educational vs. non-educational
For-profit vs. non-profit
12. Why Open Data?
Avoid Corruption
Wealth: in Europe alone, over 140 billion euros per year
http://www.economist.com/news/business/21578084-making-official-data-publi
could-spur-lots-innovation-new-goldmine
15. Why Open Data?
Citizens can express themselves and unite so that their
voices can be heard.
http://www.ted.com/talks/sanjay_pradhan_how_open_data_is_changing_international_aid.html
17. What is and what is not Open Data
Open Data.
“A piece of content or data is open if you are free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.”
The difference between open data and data that is merely publicly available lies in the use of formats that any citizen can read, use and redistribute.
Examples of public data that are not open data: data published only as spreadsheets, PDF documents, etc. Open data are usually published as CSV.
http://opensource.com/government/10/12/what-“open-data”-means-–-and-what-it-doesn’t
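Why CSV counts as a convenient open format can be seen in a few lines of code: any standard library can read it without special software. A minimal sketch, assuming a made-up three-column dataset (`region`, `year`, `value`); the sample rows and the helper names `load_rows` and `total_by_year` are hypothetical.

```python
import csv
import io

# Made-up sample data; a real dataset would be downloaded from a data portal.
SAMPLE_CSV = """region,year,value
A,2012,10
B,2012,15
A,2013,7
"""

def load_rows(text):
    """Parse CSV text into dictionaries, converting the value column to int."""
    reader = csv.DictReader(io.StringIO(text))
    return [{**row, "value": int(row["value"])} for row in reader]

def total_by_year(rows):
    """Sum the value column per year."""
    totals = {}
    for row in rows:
        totals[row["year"]] = totals.get(row["year"], 0) + row["value"]
    return totals

totals = total_by_year(load_rows(SAMPLE_CSV))
```

The same figures locked inside a PDF would first need manual or error-prone extraction before any such reuse.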
18. Opening Up Data
Rules
• Keep it simple
• Engage early and engage often
• Address common fears and misunderstandings
Four Steps
• Choose your Dataset(s)
• Apply an Open License
• Make the data available
• Make it discoverable
19. Open Data Conditions
Data Providers Requirements
• Attribution: data providers may require that they receive credit.
• Integrity: data providers may require that users indicate whether the data have been changed.
• Share-alike: data providers may require that any dataset created using their data is also open.
Distributing Open Data
• Data is machine-readable
• Data is available in bulk, rather than only through an API.
http://opendatahandbook.org/en/
55. Motivation
Semantic Web Evolution
80's: Arpanet: four servers connected. Files were transferred. Tools: ftp, telnet, e-mail.
90's: Hyperlink-based systems. Protocols: http, uri, html. Documents and data were published.
00's: Published data are enhanced with semantics! Standards to annotate and describe data: XML, RDF, RDFS, OWL. Standards to query data: SPARQL. Ontologies representing almost any domain.
Now: The Linked Open Data cloud, using the Web to connect related data that was not previously linked!
IRMLs 2010-ESWC 2010
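RDF, one of the standards named on this slide, describes data as subject-predicate-object triples, and SPARQL queries them by matching triple patterns. A minimal sketch of that idea in plain Python rather than a real RDF library; the example.org URIs and the `match` helper are hypothetical and serve only as illustration.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# All URIs below are made-up example.org identifiers.
EX = "http://example.org/"
TRIPLES = [
    (EX + "usb", EX + "locatedIn", EX + "venezuela"),
    (EX + "usb", EX + "type", EX + "University"),
    (EX + "caracas", EX + "locatedIn", EX + "venezuela"),
]

def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts like a SPARQL variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Roughly: SELECT ?s WHERE { ?s ex:locatedIn ex:venezuela }
located = [s for s, _, _ in match(TRIPLES, p=EX + "locatedIn", o=EX + "venezuela")]
```

Because every resource is named by a URI, triples published by different providers can point at the same resource, which is what makes the data "linked".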
56. The Linked Open Data Cloud
• Explosion in the number of:
  – Linking Open Data resources and databases
  – Different quality parameters
  – Controlled vocabularies: MeSH, GO, PO…
  – Highly interconnected data sources: different sizes, many links
    • Different in- and out-degrees, etc.
• Biological Web: large datasets of linking data
  • Genes, Diseases, Clinical Drugs, Proteins, and so on
Molecular databases: 1170, 95 more than 2008 and 110 more than the year before! Services and tools published by these databases follow a similar progression!
In October 2007, the Cloud of Linked Data datasets consisted of over two billion RDF triples, which were interlinked by over two million RDF links. By May 2009 this had grown to 4.2 billion RDF triples, interlinked by around 142 million RDF links! Today the Linked Open Data cloud has at least 295 datasets, 31,634,213,770 triples, and 503,998,829 links.
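The triple and link counts quoted on this slide can be compared directly. A rough sketch, assuming the May 2009 link count is 142 million (a figure in the billions would exceed the number of triples) and treating "over two billion" and "over two million" as exactly two billion and two million; the function name `growth_factors` is hypothetical.

```python
# (label, RDF triples, RDF links) as quoted on the slide, under the assumptions above.
SNAPSHOTS = [
    ("2007-10", 2_000_000_000, 2_000_000),
    ("2009-05", 4_200_000_000, 142_000_000),
    ("today", 31_634_213_770, 503_998_829),
]

def growth_factors(rows):
    """Return (label, triple growth vs. previous snapshot, links per 1000 triples)."""
    out = []
    previous = None
    for label, triples, links in rows:
        factor = triples / previous if previous else None
        out.append((label, factor, 1000 * links / triples))
        previous = triples
    return out

report = growth_factors(SNAPSHOTS)
```

Under these assumptions the cloud roughly doubled in triples between the first two snapshots, while the number of links per triple moved around rather than growing steadily.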
62. Open Data in Journalism
• It may be trendy but not new.
• Open Data implies Open Data Journalism.
• Data is not necessarily curated.
• Bigger Datasets and Small Things.
• Data Journalism is 80% perspiration, 10% great ideas, 10% output.
• Long and short-form.
• Anyone can do it.
• Visualization is important.
• Data publishers do not have to be programmers.
• It is all about stories.
http://www.theguardian.com/news/datablog/2011/jul/28/data-journalism
63. Open Data in Journalism
Sources: Breaking News, Open Data, Running Events, Shared Data
64. Open Data in Journalism
Sources: Breaking News, Open Data, Running Events, Shared Data
Data Integration
• Data Cleansing
• Conflict Resolution
Semantification
• Meta-Data Annotation
• Vocabularies
Publication
• Visualization
• Publishing the Story
65. Meta-Data
BBC News
This will help users to find news content about the stories they want to know about and ultimately help to open up references to the data contained in those stories.
http://www.bbc.co.uk/blogs/internet/posts/News-Linked-Data-Ontology
66. Data Management Tools
BBC News
http://www.bbc.co.uk/blogs/internet/posts/Linked-Data-Connecting-together-the-BBCs-Online-Content
74. Challenges for Linked Data Visualization
• Enabling user interaction
  – Users must be able to navigate through the data by exploiting the connections between Linked Data resources
  – The user might edit the underlying data to enrich it by:
    • Creating additional metadata
    • Highlighting or correcting errors
    • Validating data
• Supporting data reusability
  – The output (the plotted data or the visualization itself) might be encoded using standard ontologies and vocabularies
• Scalability
  – Linked Data visualization techniques should support the display of large amounts of data in an efficient way
EUCLID – Interaction with Linked Data
75. Challenges for Linked Open Data Visualization
• Extracting data from different repositories
  – A Linked Data set might be partitioned into several repositories
  – The region of interest (ROI) might include data from different data sets, requiring access to distributed repositories
• Handling heterogeneous data
  – The same data (concepts) might be modeled differently, for example, using different vocabularies
  – Certain values might have different formats, for example, dates represented as DD-MM-YYYY, MM-DD-YYYY or just YYYY
• Dealing with missing values
  – Due to the semi-structuredness of Linked Data, some instances might have missing values for certain properties
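The date formats and missing values mentioned on this slide are usually handled by normalizing every value to one representation before plotting. A minimal sketch, assuming the caller knows which of the two day/month orders a source uses (the `day_first` flag and the function name `normalize_date` are hypothetical); ambiguous input such as 01-02-2013 cannot be resolved from the string alone.

```python
from datetime import date

def normalize_date(value, day_first=True):
    """Return an ISO 8601 string for DD-MM-YYYY, MM-DD-YYYY or YYYY input.

    Returns None for missing or unparseable values instead of guessing.
    """
    if value is None or not value.strip():
        return None                      # missing value stays missing
    try:
        numbers = [int(part) for part in value.strip().split("-")]
    except ValueError:
        return None
    if len(numbers) == 1:
        return f"{numbers[0]:04d}"       # year only
    if len(numbers) != 3:
        return None
    first, second, year = numbers
    day, month = (first, second) if day_first else (second, first)
    try:
        return date(year, month, day).isoformat()
    except ValueError:                   # e.g. month 31 under the wrong order
        return None
```

Returning `None` keeps missing or invalid values visible to the visualization layer, which can then skip them or mark them explicitly.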
76. Linked Open Data Visualization Techniques
77. Comparison of Attributes / Values
Bar/column chart: Allows the comparison of values of different categories. Image source: http://musicbrainz.fluidops.net
Pie chart: Useful for performing comparison of percentages or proportions. Image source: http://mbostock.github.io/protovis/
Line chart: Allows visualizing data as a series of data points, where the measurement points (x-axis) are ordered. Image source: http://mbostock.github.io/protovis/
Histogram: Graphical representation of the distribution of the data. Image source: http://musicbrainz.fluidops.net
78. Analysis of Relationships and Hierarchies
Graph: The data entries are represented as nodes and the links as edges.
Arc diagram: The nodes are displayed in one dimension, and the arcs represent the connections.
Adjacency matrix diagram: The nodes are displayed as rows and columns, and the links between the nodes are entries in the matrix.
Node-link visualizations: The data is organized in hierarchies.
Source of images: http://mbostock.github.io/protovis/
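The adjacency matrix view described on this slide can be built from a plain list of links. A minimal sketch, assuming directed links given as (source, target) pairs; the node names and the function name `adjacency_matrix` are hypothetical.

```python
def adjacency_matrix(nodes, edges):
    """Build a 0/1 matrix: rows are link sources, columns are link targets."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for source, target in edges:
        matrix[index[source]][index[target]] = 1
    return matrix

nodes = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
matrix = adjacency_matrix(nodes, edges)
```

For undirected data, each link would be entered in both directions so that the matrix is symmetric.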
79. Analysis of Relationships and Hierarchies (2)
Space-filling techniques
Treemaps: Subdivide area into rectangles.
Icicles and sunburst: Hierarchies are represented by adjacencies.
Circle-packing: Containment is used to represent the hierarchies.
Rose diagrams: Sectors have equal angles, and the data is represented by the extent of each sector.
Source of images: http://mbostock.github.io/protovis/
80. Analysis of Temporal or Geographical Events
Timeline
Continuous data in time
Discrete data points in time
Sources: http://musicbrainz.fluidops.net, http://www.kottke.org/08/08/2008-movie-box-office-chart
Maps
Location maps: Display geo-points on a map
Choropleth maps: Aggregate data by geographical area
Dorling cartograms: Aggregate data and replace each area with a circle
Sources: Google Map API, http://musicbrainz.fluidops.net, http://mbostock.github.io/protovis/
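A choropleth map needs one aggregated value per geographical area, which is then mapped to a color class. A minimal sketch, assuming observations arrive as (area, value) pairs and that equal-width classes are acceptable; the area codes, the number of classes and both function names are hypothetical.

```python
from collections import defaultdict

def aggregate_by_area(observations):
    """Sum the values of all observations that fall in the same area."""
    totals = defaultdict(float)
    for area, value in observations:
        totals[area] += value
    return dict(totals)

def to_shade(totals, levels=4):
    """Map each area's total to a shade index 0..levels-1 (equal-width classes)."""
    low, high = min(totals.values()), max(totals.values())
    width = (high - low) / levels or 1   # avoid division by zero when all totals are equal
    return {
        area: min(int((value - low) / width), levels - 1)
        for area, value in totals.items()
    }

observations = [("A", 1), ("A", 2), ("B", 10), ("C", 5)]
totals = aggregate_by_area(observations)
shades = to_shade(totals)
```

A Dorling cartogram would use the same totals but scale a circle's area per region instead of picking a shade.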
83. Tasks to be Solved …
Traverse and Consume Linked Data from the LOD cloud or locally.
SPARQL endpoints have been developed to access data from the LOD cloud.
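A SPARQL endpoint is queried over HTTP: in the standard SPARQL protocol the query text is sent as the `query` parameter of the request. A minimal sketch that only builds the request URL and performs no network call; the DBpedia endpoint and the query are illustrative choices, not taken from the slides, and `build_request_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Example public endpoint (an assumption for illustration); swap in your own.
ENDPOINT = "https://dbpedia.org/sparql"

# Ask for ten property/value pairs describing one resource.
QUERY = """
SELECT ?p ?o WHERE {
  <http://dbpedia.org/resource/Venezuela> ?p ?o .
} LIMIT 10
"""

def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})

url = build_request_url(ENDPOINT, QUERY)
```

The resulting URL could then be fetched with `urllib.request.urlopen`, typically with an `Accept` header such as `application/sparql-results+json` to request JSON results.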
108. Topological properties of graphs can be used to identify patterns that reveal phenomena and anomalies, and that can potentially lead to a discovery.
There has been a significant increase of graph data in the form of social and biological information.
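One of the simplest topological properties is the degree of each node: how many links enter and leave it. A minimal sketch, assuming directed links given as (source, target) pairs; treating nodes with a high in-degree as "hubs" and the threshold value are illustrative assumptions, not a method from the slides.

```python
from collections import Counter

def degree_counts(edges):
    """Return (in_degree, out_degree) counters for a list of directed links."""
    in_degree, out_degree = Counter(), Counter()
    for source, target in edges:
        out_degree[source] += 1
        in_degree[target] += 1
    return in_degree, out_degree

def hubs(edges, threshold):
    """Nodes whose in-degree reaches the threshold, in sorted order."""
    in_degree, _ = degree_counts(edges)
    return sorted(node for node, degree in in_degree.items() if degree >= threshold)

edges = [("a", "c"), ("b", "c"), ("c", "d")]
in_degree, out_degree = degree_counts(edges)
```

Nodes whose degree is far above or below that of their neighbors are natural candidates for the patterns and anomalies mentioned above.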
119. Conclusions
• Open Data:
  – Transparency
  – Interoperability
  – Avoid Corruption
  – Boost research and development
  – Data Quality
• Linked Open Data:
  – RDF data
  – Linked to existing datasets
  – Endpoints can be used to access data