These are the Open Research Problems of Linked Data slides that we presented at the Consuming Linked Data tutorial at WWW2010 in Raleigh, NC on April 26, 2010
This document discusses opportunities and challenges of Linked Data. It begins with an overview of Linked Data principles like using URIs to identify things and linking related things. It then discusses enabling technologies like HTTP URIs and SPARQL queries. Opportunities mentioned include using the LOD cloud as a test bed and benefiting from linked context in applications. Challenges include large-scale processing of Linked Data and quality of links. The document concludes by emphasizing the potential of Linked Data to make data more valuable.
Towards long-term preservation of linked data - the PRELIDA project (PRELIDA Project)
This document summarizes a presentation about preserving linked data over the long term. It introduces the PRELIDA project, which aims to bridge the digital preservation and linked data communities. The presentation discusses what digital preservation can provide for linked data, such as file format standards, archival storage services, and documentation practices. It also outlines challenges for preserving linked data, like its dynamic and distributed nature. The PRELIDA project seeks to address these challenges through research and bringing the communities together.
Stronger together: community initiatives in journal management (Jisc)
There has been a recent growth of initiatives to address common problems regarding current and long-term access to e-journal content. Jisc is at the forefront of many of these with the close participation and active input of educational institutions.
This session aims to summarise the current state of key themes, with pointers to future directions in areas such as sustainability, the move towards e-only environments, and shared consortia approaches. It will provide an overview and panel discussion on developing the supporting infrastructure to meet users' needs, focusing on how institutions, community bodies and service providers can best work together to ensure sustainable, long-term initiatives through greater uniformity, standardisation and collaboration.
The session will introduce two new Jisc-supported projects in this area, the Keepers Registry Extra and SafeNet initiatives, and discuss how these fit alongside existing Jisc services such as Knowledge Base+, UK LOCKSS Alliance, Journal Archives and JUSP (Journal Usage Statistics Portal). The panel will address how this catalogue of services contributes towards a coherent strategy in the management of e-journal content.
This document provides an introduction to the Semantic Web and Linked Open Data. It discusses how standards like RDF, XML, and OWL allow machines to better understand the meaning of data on the web. It describes how ontologies provide a vocabulary to define relationships between resources. The document outlines the benefits of publishing data as Linked Open Data using these standards, including making data more interoperable and accessible to both humans and machines. Examples are given of biomedical research projects that use Semantic Web technologies to integrate and link different types of data.
Research in Intelligent Systems and Data Science at the Knowledge Media Insti... (Enrico Motta)
The document discusses research directions in intelligent systems and data science. It describes work on making sense of scholarly data through techniques like data mining, semantic technologies, and machine learning. It also discusses mapping and classifying computer science research areas using an automatically generated ontology with over 14,000 topics. Other topics discussed include predicting emerging research areas, applications in smart cities like the MK:Smart project, and potential roles for robots in smart cities like an autonomous health and safety inspector.
Presentation given by Stuart Macdonald at the International Workshop on ICT and e-Knowledge for the Developing World, held at the Shanghai International Convention Center, Pudong, Shanghai.
This document discusses challenges with traditional scholarly publishing and opportunities presented by open data and new publishing models.
Traditional publishing incentives prioritize publications over data sharing, which hinders reproducibility and collaboration. This has led to a growing replication gap and increasing retractions. Open data approaches could help by rewarding data release and reuse.
New publishing models are being developed to integrate data, analyses, and publications to better support reproducibility. Initiatives like GigaDB and GigaScience aim to "deconstruct" papers and provide incentives for open peer review, preprints, and implementing analyses in shared platforms like Galaxy. This represents an opportunity to address limitations of traditional publishing.
Extending Memory on the Web via Human-Centric Knowledge Exchange Network. Presented at W3C Workshop on Social Standards: The Future of Business, 7-8 August 2013, San Francisco, USA
Implementing Open Access: Effective Management of Your Research Data (Martin Hamilton)
This document discusses research data management and support available from Jisc and the Digital Curation Centre (DCC). It provides background on policy drivers for research data management, outlines support offered by the DCC including capability studies, data management planning tools, and training. It also summarizes results from a 2014 survey of UK higher education institutions which found most progress in policy development and plans, but challenges around staffing, funding, and engagement of researchers. The document concludes with feedback on future priorities such as compelling services, engaging researchers, and shared infrastructure solutions.
This presentation gives a brief overview on achievements and challenges of the Data Web and describes different aspects of using the Semantic Data Wiki OntoWiki for Linked Data management.
Research into Practice case study 2: Library linked data implementations an... (Hazel Hall)
The document summarizes a presentation given by Dr. Diane Pennington and Laura Cagnazzo on library linked data implementations and perceptions. The presentation discussed the evolution of the semantic web and linked open data principles. It provided an overview of a study on the status and perceptions of linked data among European national libraries and Scottish libraries. The study found lack of awareness and expertise to be challenges for implementation. Benefits included improved data visibility and opportunities for collaboration. Recommendations focused on training, collaboration, and developing implementation guidelines and case studies.
The University of Edinburgh approved a research data management policy in May 2011 to address growing pressures around research data. The Vice-Principal, Jeff Haywood, championed the development of the first research data management policy in the UK. The policy aimed to comply with funder requirements for open access to research data and address reputational risks around responding to Freedom of Information requests. In developing the policy, the university sought broad discussion, identified champions at various levels, and addressed gaps in research data services to support retention and access to data underlying published research.
Internet Archives and Social Science Research - Yeungnam University (mwe400)
The document discusses using large datasets from the Internet Archive to conduct social science research on emerging organizational forms. It presents examples of previous research leveraging archive data on topics like natural disasters, political activity, and social movements. The author proposes analyzing hyperlink, news coverage, Twitter, and website data on the Occupy Wall Street movement to test hypotheses about its emerging networked structure over time. Results are presented showing the growth of the movement's online presence and core clusters within its organizational network.
The document discusses the need for data infrastructure to enable open sharing of data across boundaries. It describes infrastructure as relationships between people, technologies, and institutions. The Research Data Alliance (RDA) aims to build these relationships by developing standards and recommendations to serve as "gateways" that link different data systems. RDA works both globally through international coordination, and locally through regional groups to address issues at multiple levels simultaneously.
Promises and Pitfalls: Linked Data, Privacy, and Library Catalogs (Emily Nimsakont)
This document discusses the promises and pitfalls of using linked data in library catalogs. It begins by explaining what linked data is and how it makes relationships between data explicit. Linked data initiatives like BIBFRAME aim to apply these concepts to library metadata. However, privacy is a major concern since linked data allows for more aggressive exploration of personal information. The document discusses libraries' role in protecting user privacy and explores solutions like privacy preference ontologies and standards from the W3C. Overall, while linked data holds benefits, ensuring user privacy will be an ongoing challenge for libraries to address.
Within the course, we will present Linked Data as a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the past years, leading to the creation of a global data space that contains many billions of assertions – the Web of Linked Data.
This document discusses the potential for developing a knowledge network by leveraging metadata from scientific endeavors. It begins by outlining some of the limitations of traditional metadata approaches. It then proposes that metadata could be structured as a graph using semantic triples to represent relationships between people, institutions, projects, and other elements. This liberalized metadata approach could help reduce complexity while providing a more comprehensive view of scientific activities and outputs. The document advocates for establishing common standards, developing tools to extract and aggregate metadata, and creating services and repositories to enable discovery, analysis, and visualization of the knowledge network. The goal is to facilitate research by providing integrated access to information on scientific data, publications, actors and their relationships.
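The idea of structuring metadata as a graph of semantic triples can be sketched in a few lines. This is a minimal illustration, not the document's own model: the entity names, predicates, and helper functions below are invented for the example.

```python
# (subject, predicate, object) statements linking people, institutions,
# projects, and outputs -- all identifiers here are hypothetical.
triples = [
    ("alice", "affiliated_with", "uni_x"),
    ("alice", "works_on", "project_p"),
    ("bob", "works_on", "project_p"),
    ("project_p", "funded_by", "agency_y"),
    ("project_p", "produced", "dataset_d"),
]

def neighbours(graph, node):
    """Return (predicate, object) pairs directly linked to a node."""
    return [(p, o) for s, p, o in graph if s == node]

def reachable(graph, start):
    """All entities reachable from `start` by following triples forward."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for _, obj in neighbours(graph, node):
            if obj not in seen:
                seen.add(obj)
                frontier.append(obj)
    return seen

# Everything alice connects to, directly or via her project:
print(sorted(reachable(triples, "alice")))
# -> ['agency_y', 'dataset_d', 'project_p', 'uni_x']
```

Because relationships are uniform triples rather than fields in per-record schemas, the same traversal answers questions across people, funders, and datasets without any schema-specific code, which is the "liberalized metadata" point the summary makes.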
Overview of Open Data, Linked Data and Web Science (Haklae Kim)
This document provides an overview of open data, linked data, and web science through conceptual discussions, case studies, and proposed next steps. It begins with definitions of key concepts like open data and the semantic web. Case studies demonstrate current applications of open data through government initiatives and technologies like Google's Knowledge Graph and Apple's Siri. The document concludes by acknowledging challenges with open data strategies and advocating for interdisciplinary collaboration to realize the potential of linked open government data.
The document discusses challenges and opportunities around preserving complex web content and social media. It provides examples of complex web objects like interactive stories and games that are difficult to archive using traditional tools. Social media poses additional problems like terms of service restrictions, personal data protection, and capturing dynamic conversations. However, there are also opportunities to prevent loss of cultural heritage, improve public services, and trial new preservation tools and methods. The event will include case studies on archiving interactive fiction, Twitter data for research, and web collections in museums.
New challenges for digital scholarship and curation in the era of ubiquitous ... (Derek Keats)
A keynote presentation I gave at the 4th African Digital Scholarship and Curation Conference (see: http://www.nedicc.ac.za/test/Programme.aspx) on 16 May 2011.
A presentation by Claire Stewart introducing scholarly communication and the Center for Scholarly Communication & Digital Curation (CSCDC). Presented to the Library Board of Governors, September 2011.
Wire Workshop: Overview slides for ArchiveHub Project (mwe400)
The document discusses using large datasets from the Internet Archive to conduct research. It outlines an agenda with three parts: large scale data, developing new tools, and testing and building theory. The Internet Archive contains over 10 petabytes of cultural data, including 410 billion archived web pages. The ArchiveHub project aims to create tools and guidelines for longitudinal research on archived web data. Examples of potential research topics are discussed, such as studying social movements using link and text data from websites about Occupy Wall Street. Challenges discussed include accessing and preparing the large datasets for research purposes and connecting the data to theoretical frameworks.
This document discusses partnering for research data and the various stakeholders involved. It identifies key stakeholder roles like directors, librarians, repository managers, and research support offices. Infrastructure requirements for delivering data management services are outlined, including tools for data plans, tracking impact, and more. There is a skills gap around research data that institutions are working to address through training and new specialist librarian roles in areas like data curation and management. International collaboration could help promote data literacy.
The document discusses DERI, a research institute focused on the semantic web and social semantic web. It describes DERI's work developing the SIOC ontology to represent data from social websites on the semantic web in a standardized and interoperable way. The SIOC ontology aims to connect users and data across different social/collaborative websites and to allow users to move easily between sites while bringing their data with them.
Session 1.2: improving access to digital content by semantic enrichment (semanticsconference)
This document discusses improving access to digital collections through semantic enrichment. It describes linking names and entities from text to knowledge bases like Wikidata to make the content more discoverable and usable. The process involves named entity recognition, entity linking using disambiguation algorithms, presenting enriched context, and enabling semantic search. User feedback is gathered to improve the linking algorithms through additional training. The goal is to increase trust in the links for research purposes. Overall, the approach aims to enrich text collections by connecting content to external information sources.
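The entity-linking step described above can be illustrated with a toy disambiguation sketch: pick the knowledge-base candidate whose description best overlaps the mention's sentence context. The Wikidata-style identifiers and descriptions below are invented for the example; real systems use trained disambiguation models rather than word overlap.

```python
# Hypothetical mini knowledge base: two entities share the label "Paris".
KB = {
    "Q1": {"label": "Paris", "description": "capital city of France"},
    "Q2": {"label": "Paris", "description": "town in Texas, United States"},
}

def link(mention, sentence):
    """Pick the candidate whose description shares the most words
    with the surrounding sentence (a crude disambiguation score)."""
    context = set(sentence.lower().split())
    candidates = [(qid, e) for qid, e in KB.items() if e["label"] == mention]
    if not candidates:
        return None
    scored = [(len(context & set(e["description"].lower().split())), qid)
              for qid, e in candidates]
    return max(scored)[1]

print(link("Paris", "She moved to Paris, the capital of France"))  # -> Q1
```

The user-feedback loop the summary mentions would, in this framing, supply corrected (mention, sentence, entity) examples used to retrain or reweight the scoring function.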
The document discusses net neutrality and the fragmentation of the internet. It notes that while many people just want to use technology without understanding how it works, this attitude is dangerous when it comes to the internet. The document then covers various topics related to net neutrality like internet layers, the end-to-end principle, innovation, the economy, culture, democracy, and different regulatory scenarios. It also discusses the stakeholders involved and campaigns to raise awareness of net neutrality issues.
This document provides an introduction to linked data and the semantic web. It discusses how the current web contains documents that are difficult for computers to understand, but linked data publishes structured data on the web using common standards like RDF and URIs. This allows data to be interlinked and queried using SPARQL. Publishing data as linked data makes the web appear as one huge global database. There are now many incentives for organizations to publish their data as linked data, as it enables data sharing and integration in addition to potential benefits like semantic search engine optimization. Linked data is a growing trend with many large organizations and governments now publishing data.
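The core mechanics, RDF triples queried with SPARQL-style patterns, can be sketched in plain Python. This is an illustrative toy, not a real SPARQL engine: full URIs are shortened to prefixed strings and the data is invented.

```python
# RDF-like data: URIs abbreviated with common prefixes (dbr:, rdf:, foaf:).
data = [
    ("dbr:Tim_Berners-Lee", "rdf:type", "foaf:Person"),
    ("dbr:Tim_Berners-Lee", "foaf:name", "Tim Berners-Lee"),
    ("dbr:Raleigh", "rdf:type", "dbo:City"),
]

def match(pattern, triples):
    """Bind variables (terms starting with '?') against each triple --
    roughly what a single SPARQL basic graph pattern does.
    (Repeated variables in one pattern are not unified in this sketch.)"""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                binding[term] = value
            elif term != value:
                break  # constant term does not match; skip this triple
        else:
            results.append(binding)
    return results

# Analogue of: SELECT ?s WHERE { ?s rdf:type foaf:Person }
print(match(("?s", "rdf:type", "foaf:Person"), data))
# -> [{'?s': 'dbr:Tim_Berners-Lee'}]
```

Because every publisher uses the same triple shape and shared URIs, the same pattern matching works across datasets, which is what makes the web of linked data behave like one queryable global database.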
Consuming Linked Data by Humans - WWW2010 (Juan Sequeda)
This document discusses different ways that humans can consume linked data on the web. It describes HTML browsers that can render RDFa embedded in web pages. It also discusses linked data browsers that allow users to view RDF triples in a tabular format. Faceted browsers provide a way to explore linked data through interactive facets. On-the-fly mashups dynamically combine data from multiple sources. The document encourages the development of new and innovative interfaces for interacting with linked data.
Extending Memory on the Web via Human-Centric Knowledge Exchange Network. Presented at W3C Workshop on Social Standards: The Future of Business, 7-8 August 2013, San Francisco, USA
Implementing Open Access: Effective Management of Your Research DataMartin Hamilton
This document discusses research data management and support available from Jisc and the Digital Curation Centre (DCC). It provides background on policy drivers for research data management, outlines support offered by the DCC including capability studies, data management planning tools, and training. It also summarizes results from a 2014 survey of UK higher education institutions which found most progress in policy development and plans, but challenges around staffing, funding, and engagement of researchers. The document concludes with feedback on future priorities such as compelling services, engaging researchers, and shared infrastructure solutions.
This presentation gives a brief overview on achievements and challenges of the Data Web and describes different aspects of using the Semantic Data Wiki OntoWiki for Linked Data management.
Research into Practice case study 2: Library linked data implementations an...Hazel Hall
The document summarizes a presentation given by Dr. Diane Pennington and Laura Cagnazzo on library linked data implementations and perceptions. The presentation discussed the evolution of the semantic web and linked open data principles. It provided an overview of a study on the status and perceptions of linked data among European national libraries and Scottish libraries. The study found lack of awareness and expertise to be challenges for implementation. Benefits included improved data visibility and opportunities for collaboration. Recommendations focused on training, collaboration, and developing implementation guidelines and case studies.
The University of Edinburgh approved a research data management policy in May 2011 to address growing pressures around research data. The Vice-Principal, Jeff Haywood, championed the development of the first research data management policy in the UK. The policy aimed to comply with funder requirements for open access to research data and address reputational risks around responding to Freedom of Information requests. In developing the policy, the university sought broad discussion, identified champions at various levels, and addressed gaps in research data services to support retention and access to data underlying published research.
Internet Archives and Social Science Research - Yeungnam Universitymwe400
The document discusses using large datasets from the Internet Archive to conduct social science research on emerging organizational forms. It presents examples of previous research leveraging archive data on topics like natural disasters, political activity, and social movements. The author proposes analyzing hyperlink, news coverage, Twitter, and website data on the Occupy Wall Street movement to test hypotheses about its emerging networked structure over time. Results are presented showing the growth of the movement's online presence and core clusters within its organizational network.
The document discusses the need for data infrastructure to enable open sharing of data across boundaries. It describes infrastructure as relationships between people, technologies, and institutions. The Research Data Alliance (RDA) aims to build these relationships by developing standards and recommendations to serve as "gateways" that link different data systems. RDA works both globally through international coordination, and locally through regional groups to address issues at multiple levels simultaneously.
Promises and Pitfalls: Linked Data, Privacy, and Library CatalogsEmily Nimsakont
This document discusses the promises and pitfalls of using linked data in library catalogs. It begins by explaining what linked data is and how it makes relationships between data explicit. Linked data initiatives like BIBFRAME aim to apply these concepts to library metadata. However, privacy is a major concern since linked data allows for more aggressive exploration of personal information. The document discusses libraries' role in protecting user privacy and explores solutions like privacy preference ontologies and standards from the W3C. Overall, while linked data holds benefits, ensuring user privacy will be an ongoing challenge for libraries to address.
Within the course, we will present Linked Data as a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the past years, leading to the creation of a global data space that contains many billions of assertions – the Web of Linked Data.
This document discusses the potential for developing a knowledge network by leveraging metadata from scientific endeavors. It begins by outlining some of the limitations of traditional metadata approaches. It then proposes that metadata could be structured as a graph using semantic triples to represent relationships between people, institutions, projects, and other elements. This liberalized metadata approach could help reduce complexity while providing a more comprehensive view of scientific activities and outputs. The document advocates for establishing common standards, developing tools to extract and aggregate metadata, and creating services and repositories to enable discovery, analysis, and visualization of the knowledge network. The goal is to facilitate research by providing integrated access to information on scientific data, publications, actors and their relationships.
Overview of Open Data, Linked Data and Web ScienceHaklae Kim
This document provides an overview of open data, linked data, and web science through conceptual discussions, case studies, and proposed next steps. It begins with definitions of key concepts like open data and the semantic web. Case studies demonstrate current applications of open data through government initiatives and technologies like Google's Knowledge Graph and Apple's Siri. The document concludes by acknowledging challenges with open data strategies and advocating for interdisciplinary collaboration to realize the potential of linked open government data.
The document discusses challenges and opportunities around preserving complex web content and social media. It provides examples of complex web objects like interactive stories and games that are difficult to archive using traditional tools. Social media poses additional problems like terms of service restrictions, personal data protection, and capturing dynamic conversations. However, there are also opportunities to prevent loss of cultural heritage, improve public services, and trial new preservation tools and methods. The event will include case studies on archiving interactive fiction, Twitter data for research, and web collections in museums.
New challenges for digital scholarship and curation in the era of ubiquitous ...Derek Keats
A keynote presentation that I gave at the The 4th African Digital Scholarship and Curation Conference (see: http://www.nedicc.ac.za/test/Programme.aspx) on 16 May 2011.
A presentation by Claire Stewart introducing scholarly communication and the Center for Scholarly Communication & Digital Curation (CSCDC). Presented to the Library Board of Governors, September 2011.
Wire Workshop: Overview slides for ArchiveHub Projectmwe400
The document discusses using large datasets from the Internet Archive to conduct research. It outlines an agenda with three parts: large scale data, developing new tools, and testing and building theory. The Internet Archive contains over 10 petabytes of cultural data, including 410 billion archived web pages. The ArchiveHub project aims to create tools and guidelines for longitudinal research on archived web data. Examples of potential research topics are discussed, such as studying social movements using link and text data from websites about Occupy Wall Street. Challenges discussed include accessing and preparing the large datasets for research purposes and connecting the data to theoretical frameworks.
This document discusses partnering for research data and the various stakeholders involved. It identifies key stakeholder roles like directors, librarians, repository managers, and research support offices. Infrastructure requirements for delivering data management services are outlined, including tools for data plans, tracking impact, and more. There is a skills gap around research data that institutions are working to address through training and new specialist librarian roles in areas like data curation and management. International collaboration could help promote data literacy.
1. The document discusses DERI, a research institute focused on the semantic web and social semantic web.
2. It describes DERI's work developing the SIOC ontology to represent data from social websites on the semantic web in a standardized and interoperable way.
3. The SIOC ontology aims to connect users and data across different social/collaborative websites and allow users to easily move between sites while bringing their data.
Session 1.2 improving access to digital content by semantic enrichmentsemanticsconference
This document discusses improving access to digital collections through semantic enrichment. It describes linking names and entities from text to knowledge bases like Wikidata to make the content more discoverable and usable. The process involves named entity recognition, entity linking using disambiguation algorithms, presenting enriched context, and enabling semantic search. User feedback is gathered to improve the linking algorithms through additional training. The goal is to increase trust in the links for research purposes. Overall, the approach aims to enrich text collections by connecting content to external information sources.
The document discusses net neutrality and the fragmentation of the internet. It notes that while many people just want to use technology without understanding how it works, this attitude is dangerous when it comes to the internet. The document then covers various topics related to net neutrality like internet layers, the end-to-end principle, innovation, the economy, culture, democracy, and different regulatory scenarios. It also discusses the stakeholders involved and campaigns to raise awareness of net neutrality issues.
This document provides an introduction to linked data and the semantic web. It discusses how the current web contains documents that are difficult for computers to understand, but linked data publishes structured data on the web using common standards like RDF and URIs. This allows data to be interlinked and queried using SPARQL. Publishing data as linked data makes the web appear as one huge global database. There are now many incentives for organizations to publish their data as linked data, as it enables data sharing and integration in addition to potential benefits like semantic search engine optimization. Linked data is a growing trend with many large organizations and governments now publishing data.
Consuming Linked Data by Humans - WWW2010 (Juan Sequeda)
This document discusses different ways that humans can consume linked data on the web. It describes HTML browsers that can render RDFa embedded in web pages. It also discusses linked data browsers that allow users to view RDF triples in a tabular format. Faceted browsers provide a way to explore linked data through interactive facets. On-the-fly mashups dynamically combine data from multiple sources. The document encourages the development of new and innovative interfaces for interacting with linked data.
The document provides an overview of the Semantic Web and linked data. It defines the Semantic Web as publishing structured data on the web in a format that computers can understand, rather than just documents. Linked data follows principles like using URIs to identify things and linking data across sources to integrate information. Query languages like SPARQL can then be used to search across linked data. Examples show how data can be published as RDF and linked to create a global database. Applications that consume and combine linked data from multiple sources are discussed.
These are the Linked Data Applications slides that we presented at the Consuming Linked Data tutorial at WWW2010 in Raleigh, NC on April 26, 2010.
This slide set was not part of our tutorial that was presented at ISWC2009
Welcome to Linked Data 0/5 Semtech2011 (Juan Sequeda)
This document discusses creating, publishing and consuming linked data. It introduces key concepts related to linked data including HTML, CSS, HTTP, XML, JSON, API, URL, URI, RDF, RDFa, RDFS, OWL, RIF and SPARQL. The document includes a schedule but provides no further details.
Consuming Linked Data by Machines - WWW2010 (Juan Sequeda)
These are the Consuming Linked Data by Machines slides that we presented at the Consuming Linked Data tutorial at WWW2010 in Raleigh, NC on April 26, 2010. These slides are originally by Patrick Sinclair from BBC
Drupal 7 and Semantic Web Hands-on Tutorial (Juan Sequeda)
This document outlines the schedule and details for a seminar on using Drupal 7 for the Semantic Web. The day-long event includes sessions on rich snippets, an introduction to the Semantic Web, and hands-on advanced topics using Semantic Web technologies with Drupal. The schedule also lists times for registration, breaks, lunch, and a happy hour reception. Background is provided on one of the speakers, Stéphane Corlosquet, who has significantly contributed RDF and Semantic Web capabilities to Drupal.
Virtualizing Relational Databases as Graphs: a multi-model approach (Juan Sequeda)
Talk given at Smart Data 2017
Relational databases are inflexible due to the rigid constraints of the relational data model. If you have new data that doesn't fit your schema, you need to alter the schema (add a column or a new table). That is not always possible: IT departments don't have time, or they won't allow it, and the change often just adds more nulls, which can degrade query performance.
A goal of graph databases is to address this problem with their schema-less graph data model. However, many businesses have large investments in commercial RDBMSs and their associated applications and can't expect to move all of their data to a graph database.
In this talk, I will present a multi-model graph/relational architecture solution. Keep your relational data where it is, virtualize it as a graph, and then connect it with additional data stored in a graph database. This way, both graph and relational technologies can seamlessly interact together.
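The virtualization step can be sketched in the spirit of a direct mapping: each row becomes a node identified by a URI, and each column becomes a predicate. The table name, columns, and base URI below are made up for illustration.

```python
def rows_to_triples(table_name, primary_key, rows, base="http://ex.org/"):
    """Virtualize relational rows as graph triples (direct-mapping style)."""
    triples = []
    for row in rows:
        subject = f"{base}{table_name}/{row[primary_key]}"
        for column, value in row.items():
            if column != primary_key and value is not None:
                triples.append((subject, f"{base}{table_name}#{column}", value))
    return triples

employees = [
    {"id": 1, "name": "Ada", "dept": "R&D"},
    {"id": 2, "name": "Grace", "dept": None},  # NULLs simply produce no edge
]

for t in rows_to_triples("employee", "id", employees):
    print(t)
```

Note how the graph view sidesteps the null problem mentioned above: a missing value is simply an absent edge, not a NULL cell.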
The document discusses the Semantic Web and linked data. It defines the current web as consisting of documents linked by hyperlinks that are readable by humans but difficult for computers to understand. The Semantic Web aims to publish structured data on the web using common standards like RDF so that data can be linked, queried, and integrated across sources. Key points include:
- The Semantic Web uses RDF to represent data as a graph so that data from different sources can be linked together.
- Linked data follows principles like using URIs to identify things and including links to other related data.
- Query languages like SPARQL allow searching and integrating linked data from multiple sources.
- There are now
The document discusses the Semantic Web. It explains that the Semantic Web publishes structured data using RDF so that the data can be linked and integrated. It also describes how large companies like Google and Facebook, as well as governments, are using RDF, and how the Semantic Web will make it possible to search for and find information more effectively in the future.
This document provides information and advice about applying for the National Science Foundation Graduate Research Fellowship. It discusses key details of the fellowship such as eligibility requirements, funding amounts, and required application materials. The fellowship is highly competitive, so applicants are advised to spend 20 hours per week preparing their application, which must demonstrate both intellectual merit of the proposed research and its potential broader impacts. Strong letters of recommendation, personal and research statements, and proposing a feasible research plan are essential. Overall, the document offers guidance on crafting a competitive application by being specific, tying different parts together, and focusing on uniqueness.
Linked Data is a set of best practices for publishing data on the Web using standardized data models (RDF) and access methods (HTTP), enabling easier integration of data from different sources compared to proprietary APIs. The Linked Data architecture is open and allows discovery of new data sources at runtime, allowing applications to take advantage of new available data. When publishing Linked Data, considerations include linking to other datasets, and providing provenance, licensing, and access metadata using common vocabularies. Linked Data principles can also be applied within intranets for data integration.
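The standardized access method mentioned above is plain HTTP: a Linked Data client asks for RDF via content negotiation rather than calling a proprietary API. A minimal sketch, assuming a hypothetical URI; the request is only constructed, never actually sent.

```python
import urllib.request

# Hypothetical resource URI; a real client would dereference it over HTTP.
uri = "http://example.org/resource/Austin"
req = urllib.request.Request(
    uri,
    headers={"Accept": "text/turtle, application/rdf+xml;q=0.8"},
)

# This is what would go on the wire; a server doing content negotiation
# would return Turtle here, while a browser sending Accept: text/html
# would get the human-readable page for the same URI.
print(req.get_header("Accept"))
```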
Presentation at Data/Graph Day Texas Conference.
Austin, Texas
January 14, 2017
This talk grew out of Juan Sequeda's office hours following the Seattle Graph Meetup. Some of the questions posed were: How do I recognize a problem best solved with a graph solution? How do I determine the best type of graph to solve the problem? How do I manage the data when both graph and relational operations will be performed? Juan did such a great job of explaining the options that we asked him to develop his responses into a formal talk.
Graph Query Languages: update from LDBC (Juan Sequeda)
The Linked Data Benchmark Council (LDBC) is a non-profit organization dedicated to establishing benchmarks, benchmark practices and benchmark results for graph data management software. The Graph Query Language task force of LDBC is studying query languages for graph data management systems, and specifically those systems storing so-called Property Graph data. The goals of the GraphQL task force are to:
Devise a list of desired features and functionalities of a graph query language.
Evaluate a number of existing languages (i.e. Cypher, Gremlin, PGQL, SPARQL, SQL), and identify possible issues.
Provide a better understanding of the design space and state-of-the-art.
Develop proposals for changes to existing query languages or even a new graph query language.
This query language should cover the needs of the most important use-cases for such systems, such as social network and Business Intelligence workloads.
This talk will present an update of the work accomplished by the LDBC GraphQL task force. We also look for input from the graph community.
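The Property Graph data model the task force targets can be illustrated in a few lines: nodes and edges both carry key-value properties, which is the main difference from plain RDF triples. The data and traversal below are made up; real systems would express this step in Cypher, Gremlin, PGQL, or SQL.

```python
# A tiny in-memory property graph: properties on nodes AND on edges.
nodes = {
    1: {"label": "Person", "name": "Alice"},
    2: {"label": "Person", "name": "Bob"},
    3: {"label": "Post", "content": "hello"},
}
edges = [
    (1, "KNOWS", 2, {"since": 2015}),  # edge properties: a key PG feature
    (2, "LIKES", 3, {}),
]

def neighbors(node_id, edge_type):
    """Which nodes does node_id reach over edges of the given type?"""
    return [dst for src, typ, dst, _ in edges if src == node_id and typ == edge_type]

# One step of a social-network workload: who does Alice know?
print([nodes[n]["name"] for n in neighbors(1, "KNOWS")])
```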
Publishing Linked Data 3/5 Semtech2011 (Juan Sequeda)
This document summarizes techniques for publishing linked data on the web. It discusses publishing static RDF files, embedding RDF in HTML using RDFa, linking to other URIs, generating linked data from relational databases using RDB2RDF tools, publishing linked data from triplestores and APIs, hosting linked data in the cloud, and testing linked data quality.
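Of the publishing options listed, embedding RDF in HTML with RDFa is the lightest-weight. A minimal sketch of generating such markup, assuming a schema.org-style vocabulary; the book data and URIs are invented for illustration.

```python
def book_rdfa(uri, title, author_uri):
    """Render a book description as HTML with RDFa 1.1 Lite attributes."""
    return (
        f'<div about="{uri}" vocab="http://schema.org/" typeof="Book">\n'
        f'  <span property="name">{title}</span> by\n'
        f'  <a property="author" href="{author_uri}">the author</a>\n'
        f'</div>'
    )

html = book_rdfa("http://ex.org/book/1", "Weaving the Web", "http://ex.org/TimBL")
print(html)
# An RDFa-aware crawler would extract triples such as
# (http://ex.org/book/1, schema:name, "Weaving the Web") from this markup.
```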
WTF is the Semantic Web and Linked Data (Juan Sequeda)
This document provides an overview of the Semantic Web and Linked Data. It begins by explaining some of the limitations of the current web, which treats all content as unstructured documents rather than structured data. It then introduces the Semantic Web and its data model, RDF, which allows publishing structured data on the web in a standardized way using graph-based representations. This enables linking different data sources on the web, addressing the problem of data silos. The document provides examples of representing bibliographic data about books in RDF and linking it to other datasets, demonstrating how the Semantic Web enables integrating and finding related information on the web.
My Linked Data tutorial presentation that I presented at Semtech 2012.
http://semtechbizsf2012.semanticweb.com/sessionPop.cfm?confid=65&proposalid=4724
This document discusses various approaches for building applications that consume linked data from multiple datasets on the web. It describes characteristics of linked data applications and generic applications like linked data browsers and search engines. It also covers domain-specific applications, faceted browsers, SPARQL endpoints, and techniques for accessing and querying linked data including follow-up queries, querying local caches, crawling data, federated query processing, and on-the-fly dereferencing of URIs. The advantages and disadvantages of each technique are discussed.
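The "on-the-fly dereferencing" technique mentioned above is follow-your-nose crawling: start from one URI, fetch its description, then follow the URIs it mentions. A sketch with a mock web standing in for real HTTP lookups; all URIs are hypothetical.

```python
from collections import deque

# Mock web: URI -> list of triples its description contains.
MOCK_WEB = {
    "http://ex.org/A": [("http://ex.org/A", "knows", "http://ex.org/B")],
    "http://ex.org/B": [("http://ex.org/B", "knows", "http://ex.org/C")],
    "http://ex.org/C": [("http://ex.org/C", "name", "Carol")],
}

def crawl(start, limit=10):
    """Breadth-first dereferencing of URIs, bounded to keep it polite."""
    seen, queue, graph = set(), deque([start]), []
    while queue and len(seen) < limit:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.add(uri)
        for triple in MOCK_WEB.get(uri, []):  # "dereference" the URI
            graph.append(triple)
            for term in (triple[0], triple[2]):
                if term.startswith("http://") and term not in seen:
                    queue.append(term)
    return graph

print(len(crawl("http://ex.org/A")))
```

The trade-off discussed in the document shows up directly here: the crawler discovers data it was never configured to know about, at the cost of many lookups at query time.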
A call to librarians to use their library powers in the community beyond the walls of their institutions as the open data folks need their knowledge!
Title:
Open Sesame: Open Data, Data Liberation and New Opportunities for Libraries
Abstract:
Cities and data producers are quickly embracing Open Data, albeit unevenly. The Data Liberation Initiative (DLI) has been a pioneer in broadening access to data for nearly two decades. This session will examine the relevance of Data Liberation in terms of Open Data and explore how librarians can step up to the plate to make Open Data/Open Government as successful as DLI.
Speakers:
- Wendy Watkins, Data Librarian, Carleton University
- Ernie Boyko, Adjunct Data Librarian, Carleton University
- Tracey P. Lauriault, Post Doctoral Fellow, Carleton University (tlauriau@gmail.com)
- Margaret Haines, University Librarian, Carleton University
Workshop session given at the Institutional Web Management Workshop 2012 (IWMW 2012) event held at the University of Edinburgh on 18th - 20th June 2012.
This document summarizes a presentation on linked open government data. It discusses how government data is being opened through initiatives like Data.gov and how linked data approaches can help address challenges in making open government data more interoperable, scalable, and able to maintain provenance. Key points discussed include the growth of open government data, challenges in working with raw open data, benefits of converting data to linked open formats, and open questions around improving interoperability, addressing scalability issues, and maintaining provenance as open government data continues to expand.
The document discusses the transition from the traditional web (Web 1.0) to the semantic web (Web 3.0) through Web 2.0. It outlines the key principles of linking data on the web in a way that is machine-readable and outlines progress made in publishing linked open government data through the UK's data.gov.uk portal, which has released over 1500 datasets from government departments. The document argues that linked open data can drive transparency, economic and social value, and improvements to public services.
In recent years governments and research institutions have emphasized the need for open data as a fundamental component of open science. But we need much more than the data themselves for them to be reusable and useful. We need descriptive and machine-readable metadata, of course, but we also need the software and the algorithms necessary to fully understand the data. We need the standards and protocols that allow us to easily read and analyze the data with the tools of our choice. We need to be able to trust the source and derivation of the data. In short, we need an interoperable data infrastructure, but it must be a flexible infrastructure able to work across myriad cultures, scales, and technologies. This talk will present a concept of infrastructure as a body of human, organisational, and machine relationships built around data. It will illustrate how a new organization, the Research Data Alliance, is working to build those relationships to enable functional data sharing and reuse.
1) The document discusses how open data and interoperability can drive innovation by empowering people and communities through access to government data.
2) Key points include how open data can meet regulatory needs, communicate with citizens, and spur new economic development and innovation.
3) An open data ecosystem is created by gathering and connecting data, infrastructure, developers, and communities to empower choices and change behavior.
The document summarizes a presentation by Mark A. Parsons on opportunities and challenges for data sharing and citation. The presentation discusses how all of society's grand challenges require diverse data shared across boundaries, and the vision of the Research Data Alliance (RDA) to openly share data. RDA builds social and technical bridges to enable open data sharing through developing infrastructure, standards, and best practices. The presentation also covers specific RDA activities like developing data citation recommendations and engaging members globally.
Keynote: Mark Parsons - Plans are Useless, But Planning is Essential (CASRAI)
The document discusses the need for infrastructure to share data across cultures through bridges and gateways. It describes the Research Data Alliance's (RDA) role in building these connections by developing standards and recommendations to enable open data sharing. RDA works globally through interest and working groups, but also locally through regional groups. The goal is to address society's challenges by fostering relationships and finding solutions to issues around diversity and change.
IWMW 2000: report on the Joined-up Web session (IWMW)
This document discusses the need for a more joined-up web to improve communication and access to information across different systems and databases. Some of the key issues mentioned are the need for cultural change, cross-searching across dispersed resources, and integrating students' academic and personal online experiences. Technical challenges include different software versions and plug-ins, while users prefer a seamless interface over multiple sign-ons and visibility of the internal workings. The document recommends circulating information on relevant projects, creating web pages linking to examples, and continuing development of initiatives like Sparta to further a more joined-up web.
Social Innovation across the digital platform with semantic web, conference presentation in Glasgow, Scotland
Leveraging knowledge through OpenSource technology on websites via a CMS
This document discusses publishing EPA data as linked open data. It notes that the goal is to make open data, content, and web APIs the new default. Linked data allows data to be connected and treated like web pages. The EPA has published facility and substance data as linked open data and is piloting additional datasets. Publishing linked open data requires identifying relevant datasets, modeling the data, publishing it according to best practices, maintaining scripts to keep data current, and reviewing usage to support users. Recommendations include publishing in reusable formats like RDF, using open rather than proprietary formats, and defining URI strategies without reinventing existing best practices or vocabularies.
Linked Data Overview - structured data on the web for US EPA 20140203 (3 Round Stones)
The document provides an overview of linked open data and the EPA's efforts to publish its environmental data as linked open data. It discusses the need for improved data platforms to share integrated environmental data. Linked open data uses international standards to publish and connect data on the web, providing context and allowing for improved access and reuse of data. The EPA publishes a large amount of data in CSV files and is now moving to a cloud-based linked open data system to publish facility, chemical, and pollution reports, making the data more reusable and helping more types of audiences use the data.
Final version of the general presentation that the RDA Secretary General presented about a dozen times at various conferences and workshops around Europe in the last two months.
The document provides an overview of the Dublinked Technology Workshop held on December 15th, 2011. It includes presentations on transportation data, spatial web services, linked data, and semantic data description. Breakout sessions covered topics like data publishing, discovery, web services, and advanced functions. The workshop aimed to address challenges around sharing digital data between organizations and discussed technical requirements and tools to support open government data platforms.
VIVO is an open-source semantic web application and information model that enables discovery of research across disciplines at institutions. It harvests data from verified sources to create detailed profiles of faculty and researchers. The structured linked data in VIVO allows for relationships and connections between researchers, publications, grants, and more to be visualized. Libraries can play important roles in implementing and supporting VIVO through activities like outreach, training, ontology development, and technical support.
Sentara Linked Data Workshop - Sept 10, 2012 (3 Round Stones)
One day workshop to Sentara Healthcare on using a Linked Data approach for enterprise architecture. Topics include: Open Government Data initiatives, demo of Weather Health Web application; leveraging open data from NIH, NLM, NOAA, EPA, HHS; Callimachus Enterprise, a Linked Data Management System for the enterprise.
Linked Data for the Masses: The approach and the Software (IMC Technologies)
Title: Linked Data for the Masses: The approach and the Software
@ EELLAK (GFOSS) Conference 2010
Athens, Greece
15/05/2010
Creator: George Anadiotis (R&D Director)
The document discusses the evolution of the semantic web and big data. It provides examples of how semantic web technologies can be applied to large datasets from domains such as climate research. It also discusses linked open data and the growth of the linked open data cloud over time. Public open data initiatives are described along with the benefits of a data economy where non-tangible assets like data play a significant role.
Similar to Open Research Problems in Linked Data - WWW2010 (20)
Integrating Semantic Web with the Real World - A Journey between Two Cities ... (Juan Sequeda)
(The original version of this talk was a Keynote at KCAP2017. This is the final version of the slides after giving this talk 14 times in 2018)
An early vision in Computer Science has been to create intelligent systems capable of reasoning on large amounts of data. Today, this vision can be delivered by integrating Relational Databases with the Semantic Web using the W3C standards: a graph data model (RDF), ontology language (OWL), mapping language (R2RML) and query language (SPARQL). The research community has successfully been showing how intelligent systems can be created with Semantic Web technologies, dubbed now as Knowledge Graphs.
However, where is the mainstream industry adoption? What are the barriers to adoption? Are these engineering and social barriers or are they open scientific problems that need to be addressed?
This talk will chronicle our journey of deploying Semantic Web technologies with real world users to address Business Intelligence and Data Integration needs, describe technical and social obstacles that are present in large organizations, and scientific and engineering challenges that require attention.
Integrating Semantic Web in the Real World: A Journey between Two Cities (Juan Sequeda)
Keynote at The 9th International Conference on Knowledge Capture (KCAP2017), Austin, Texas, Dec 2017
An early vision in Computer Science has been to create intelligent systems capable of reasoning on large amounts of data. Today, this vision can be delivered by integrating Relational Databases with the Semantic Web using the W3C standards: a graph data model (RDF), ontology language (OWL), mapping language (R2RML) and query language (SPARQL). The research community has successfully been showing how intelligent systems can be created with Semantic Web technologies, dubbed now as Knowledge Graphs.
However, where is the mainstream industry adoption? What are the barriers to adoption? Are these engineering and social barriers or are they open scientific problems that need to be addressed?
This talk will chronicle our journey of deploying Semantic Web technologies with real world users to address Business Intelligence and Data Integration needs, describe technical and social obstacles that are present in large organizations, and scientific challenges that require attention.
Integrating Relational Databases with the Semantic Web: A Reflection (Juan Sequeda)
This is a lecture given at the 2017 Reasoning Web Summer School
It has been clear from the beginning that the success of the Semantic Web hinges on integrating the vast amount of data stored in Relational Databases. In 2007, the W3C organized a workshop on RDF Access to Relational Databases. In 2012, two standards were ratified that map relational data to RDF: Direct Mapping and R2RML.
In this lecture, I will reflect on the last 10 years of research results and systems to integrate Relational Databases with the Semantic web. I will provide an answer to the following question: how and to what extent can Relational Databases be integrated with the Semantic Web? I will review how these standards and systems are being used in practice for data integration and discuss open challenges.
This document discusses Linked Data and the best practices for publishing and interlinking data on the web. It covers four main principles:
1) Use URIs as names for things and identify real-world objects with HTTP URIs.
2) Use HTTP URIs so that people can look up those names by dereferencing the URIs.
3) Provide useful RDF information when URIs are dereferenced, using formats like RDF/XML, RDFa, N3, or Turtle.
4) Include links to other URIs to discover more related things and connect isolated data silos. This allows data to be interlinked on the Web.
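The four principles can be tied together in a small sketch: an HTTP URI names a thing, and dereferencing it yields useful RDF (here, Turtle text) that links onward to other URIs. Prefix-free URIs and the data itself are invented for illustration.

```python
def to_turtle(subject, properties):
    """Serialize one resource description as Turtle (principle 3)."""
    parts = []
    for predicate, obj in properties:
        # URIs become links; anything else becomes a literal.
        value = f"<{obj}>" if obj.startswith("http") else f'"{obj}"'
        parts.append(f"    <{predicate}> {value}")
    return f"<{subject}>\n" + " ;\n".join(parts) + " ."

doc = to_turtle("http://ex.org/Raleigh", [
    ("http://ex.org/name", "Raleigh"),
    ("http://ex.org/state", "http://ex.org/NorthCarolina"),  # principle 4: a link out
])
print(doc)
```

The outbound `<http://ex.org/NorthCarolina>` link is what lets a client discover more related things, connecting otherwise isolated silos.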
This document discusses different ways that humans can consume linked data, including through HTML browsers, linked data browsers, faceted browsers, and on-the-fly mashups. It notes that Google and Yahoo are starting to crawl RDFa to surface semantic information, and that linked data browsers allow viewing linked data returned from URIs in tabular form. It calls for creating new innovative ways to interact with linked data and partnering with the HCI community to develop novel user interfaces.
This document introduces linked data and discusses how publishing data as linked RDF triples on the web allows for a global linked database. It explains that linked data uses HTTP URIs to identify things and links data from different sources to be queried using SPARQL. Publishing linked data provides benefits like being able to integrate and discover related data on the web. Tools are available to convert existing data or publish new data as linked open data.
Essentials of Automations: The Art of Triggers and Actions in FME (Safe Software)
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Pushing the limits of ePRTC: 100ns holdover for 100 days (Adtran)
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
How to Get CNIC Information System with Paksim Ga.pptx (danishmna97)
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
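The core idea can be sketched as a greedy byte-pruning loop: drop any seed byte whose removal leaves observed coverage unchanged, so the fuzzer stops mutating bytes that cannot matter. This is a simplified illustration of the concept, not the actual DIAR algorithm; `coverage()` is a made-up stand-in for real instrumentation.

```python
def coverage(data: bytes) -> frozenset:
    """Hypothetical target: only the header byte and any 'X' bytes matter."""
    hits = {"start"}
    if data[:1] == b"M":
        hits.add("magic_ok")
    if b"X" in data:
        hits.add("x_branch")
    return frozenset(hits)

def shrink_seed(seed: bytes) -> bytes:
    """Greedily delete bytes that do not change the coverage baseline."""
    baseline, out = coverage(seed), seed
    i = 0
    while i < len(out):
        candidate = out[:i] + out[i + 1:]  # try deleting byte i
        if coverage(candidate) == baseline:
            out = candidate                # byte was uninteresting: drop it
        else:
            i += 1                         # byte matters: keep it
    return out

print(shrink_seed(b"MjunkXjunk"))
```

Every byte surviving the shrink influences a branch, so subsequent mutations are spent where they can change program behavior.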
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... (Neo4j)
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Building RAG with self-deployed Milvus vector database and Snowpark Container... (Zilliz)
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
Climate Impact of Software Testing at Nordic Testing Days (Kari Kakkonen)
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor... (SOFTTECHHUB)
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Communications Mining Series - Zero to Hero - Session 1 (DianaGray10)
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Open Research Problems in Linked Data - WWW2010
1. Linked Data: Open Research Problems
Consuming Linked Data Tutorial
World Wide Web Conference 2010

2. Exactly 1 year ago at WWW2009
• BOF on a Research Agenda for Linked Data
• http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/MadridBOF

3. 2009's Top 10 Linked Data Research Issues
• Privacy
• Inter-culturalization
• Creating Linked Data
  – Creating and maintaining links
  – Co-reference
  – Design methodology
  – NLP for link extraction

4. 2009's Top 10 Linked Data Research Issues
• Updating Linked Data
  – Synchronization of datasets and links
  – SPARUL federated transactions
  – History, logs, change sets, temporal tracking
• Evaluation, metrics and benchmarks
• Trust and provenance
• Publishing Linked Data
  – Licensing
  – Legal and social implications

5. 2009's Top 10 Linked Data Research Issues
• UI
  – User interface/interaction and usability
  – Visualizing Linked Data
  – Natural language interfaces
• Internet of Things (sensors)
• Social and economic impact
• Web-scale data management
  – Indexing
  – Crawling

7. Hot Topics
• Interlinking algorithms
• Provenance and trust
• Dataset dynamics
• UI
• Distributed query
• Evaluation
  – "You want a good thesis? IR is based on precision and recall. The minute you add semantics, it is a meaningless feature. Logic is based on soundness and completeness. We don't want soundness and completeness. We want a few good answers quickly." – Jim Hendler at WWW2009 during the LOD gathering
Thanks Michael Hausenblas

8. Linked Data Triplification Challenge
• Open Track
  – Novel datasets published as part of the Web of Data, demonstrating potential benefit of use within applications
  – Novel generic mechanisms, approaches and technologies to publish Linked Data
  – Applications showcasing the benefits of Linked Data to end users

9. Linked Data Triplification Challenge
• Open Government Track
  – Build a web application that makes use of open government datasets (i.e. environmental, cadastral, geographic, traffic, historical, public speeches, election, corporate spending, etc.)
  – At least one source must be part of Linked Open Data
Submissions due May 18
http://www.triplify.org/Challenge/2010

10. It's lunch time!
• You learned what Linked Data is
• You realized that there is a lot of data out on the web
• You got excited about Linked Data
• You learned about the different ways of consuming Linked Data
• You realized that there is still work to be done
• You want to create a Linked Data application (and participate in the Triplification Challenge)
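The "interlinking algorithms" hot topic in a nutshell: decide when two URIs in different datasets describe the same thing. Below is a naive token-overlap matcher over labels, purely illustrative with invented data; real interlinking systems use far richer evidence than label similarity.

```python
import re

def tokens(label: str) -> set:
    """Normalize a label into lowercase word tokens (punctuation dropped)."""
    return set(re.findall(r"[a-z0-9\-]+", label.lower()))

def jaccard(a: str, b: str) -> float:
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def link_candidates(source, target, threshold=0.5):
    """Propose owl:sameAs candidates between two {uri: label} datasets."""
    return [
        (u1, u2, round(jaccard(l1, l2), 2))
        for u1, l1 in source.items()
        for u2, l2 in target.items()
        if jaccard(l1, l2) >= threshold
    ]

dbp = {"http://ex.org/dbp/TBL": "Tim Berners-Lee"}
lib = {"http://ex.org/lib/p42": "Berners-Lee, Tim"}
print(link_candidates(dbp, lib))
```

Even this toy shows why link quality is an open problem: the threshold trades precision against recall, echoing the evaluation debate quoted on the Hot Topics slide.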