Linked Science Semantic Web Value Proposition Scientometrics
Linked (Data) Scientometrics
Linked Science 2015 Keynote
Krzysztof Janowicz
STKO Lab, University of California, Santa Barbara, USA
Linked Data Scientometrics K. Janowicz
What is Linked Science?
Scientific dissemination traditionally relies heavily on scholarly articles
and presentations at conferences. However in the past few years, we have
seen an increasing trend towards the publication of raw research data to
facilitate verification and reuse. Linked Science champions the process of
publishing, sharing and interlinking scientific resources and data along
with complete experiment context, which is critical for understanding,
reusing and verifying scientific research. Semantic Web technologies
provide a promising means for achieving this practice.
(From the Linked Science 2015 call)
What are the research questions of Linked Science, and what are the bottlenecks?
What are Scientometrics?
The field of scientometrics is concerned with measuring and analyzing
the impact of science in its broadest sense.
(Raw) data by example
Publications
Authors
Affiliations
Keywords
Themes
Funding sources
Citations
...
What is meant by measuring and analyzing?
Scientometrics Research Questions
Research questions by example
Simple and boring
Number of papers at ISWC 2015
Boring
Number of papers by a specific W. Zhang in 2015
Simple and interesting
What goes here? ∅
Should be simple and interesting
How does a change in affiliations impact a researcher’s interests?
Is there a relation between spatial proximity and citations?
Interesting
Is the Semantic Web as a research area growing or shrinking?
Are Linked Data and Semantic Web the same community?
Are the research interests of a researcher changing?
What are the new research trends in Artificial Intelligence?
To which university should I go to study geo-semantics?
Who are good reviewers for a certain paper?
Why are interesting scientometrics questions not simple?
Retrieval
Key Limitations: Data Retrieval
Even the major data hubs such as Data.gov still rely on keyword-based search
and have unreliable, incomplete, or missing metadata. For this type of
retrieval problem, even ‘a little semantics goes a long way’ (Hendler 1997).
Sensemaking
Key Limitations: Sensemaking and Fitness for Purpose
There is no shortage of data, but finding data that is fit for a certain purpose is difficult.
Data as statements not as truth, e.g., according to Springer I am at WSU not UCSB.
Heterogeneity is caused by cultural differences, progress in science, viewpoints, ...; e.g., associate professor versus senior lecturer
Lack of provenance information
Sensemaking requires more powerful semantic technologies and ontologies (compared to IR).
Interoperability
Key Limitations: Meaningful Analysis and Synthesis
Ensuring that data is analyzed and combined in a meaningful way is far from trivial.
What if the information on how to use the data would come together with these data?
Focus on smart data instead of (merely on) smart applications.
The purpose of ontologies is not to agree on the meaning of terms but to make the data provider’s intended meaning explicit.
Smart Data
The Smart Data Argument
One of the key arguments underlying the Semantic Web and
Linked Data paradigms is to make data smart, not applications.
Instead of developing increasingly complex software, the
so-called business logic should be moved to the (meta)data.
The rationale is that smart data will make all future applications
more usable, flexible, and robust, while smarter applications
fail to improve data along the same dimensions.
(http://goo.gl/FMXOZT)
Why the Data Train Needs Semantic Rails. (2015) K. Janowicz, F. van Harmelen, J. Hendler, P. Hitzler
Semantics-Enabled Linked-Data-Driven Scientometrics
How does this relate to scientometrics?
Semantics-Enabled Linked-Data-Driven Scientometrics
Integrates data from a variety of sources, e.g., Semantic Web Dog Food, SWJ.
Example: http://stko-exp.geog.ucsb.edu/lak/
ISWC Installation Based on New Deployment Framework
http://scientometrics.geog.ucsb.edu/iswc/
Smart Data: the first scientometrics installation (for SWJ) took months to develop and
deploy; now we are down to hours, at least when leaving semantic lifting and data
cleaning aside (!) and when using a reduced number of modules (8/30).
Value Proposition
Why do we use Semantic Web and Linked Data for Scientometrics?
Federated queries over multiple data sources
Unique global identifiers ease conflation and deduplication
Transparent data model; reduces the need for guessing
No data silos, no API restrictions
Many pre-defined lightweight vocabularies (ontologies)
Smart data reduces the need for smart applications
Machine reasoning support
So do we still need a deeper knowledge representation beyond
surface semantics?
Web@25
Web@25 Installation: Timeline
Keyword frequency for Semantic Web; WWW conference series (1994-2013)
http://stko-exp.geog.ucsb.edu/web25portal/index.html
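A keyword timeline of this kind can be approximated by counting, per year, the papers that carry a given keyword. The following is a minimal sketch, not the Web@25 implementation: the function name `keyword_timeline`, the `(year, keywords)` record layout, and the toy records are all assumptions made for illustration.

```python
from collections import Counter


def keyword_timeline(papers, keyword):
    """Count, per year, the papers whose keyword list contains `keyword`.

    `papers` is an iterable of (year, keywords) pairs; matching ignores case.
    """
    target = keyword.casefold()
    counts = Counter()
    for year, keywords in papers:
        if any(k.casefold() == target for k in keywords):
            counts[year] += 1
    return dict(sorted(counts.items()))


# Hypothetical toy records, not taken from the WWW conference data.
papers = [
    (2004, ["Semantic Web", "Ontology"]),
    (2004, ["Search Engine"]),
    (2008, ["semantic web", "Linked Data"]),
    (2012, ["Linked Data"]),
]

print(keyword_timeline(papers, "Semantic Web"))
print(keyword_timeline(papers, "Linked Data"))
```

Exact string matching is the weak point of such a count: spelling variants and abbreviations of the same keyword would have to be conflated first.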
Web@25 Installation: Timeline
Keyword frequency for Linked Data; WWW conference series (1994-2013)
http://stko-exp.geog.ucsb.edu/web25portal/
An Interesting Question
Given the keyword timeline, is the Semantic Web as a research field
disappearing, diversifying, radiating, ...?
Let the data speak for themselves
Community detection
Colors: community membership, node size: frequency, line width: co-occurrence strength
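The two quantities encoded in the figure, node size (keyword frequency) and line width (co-occurrence strength), can be derived from per-paper keyword lists. The sketch below assumes a hypothetical helper `cooccurrence_graph` and toy keyword lists; it only builds the weighted graph and leaves the community detection step (the coloring) to a separate algorithm.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_graph(keyword_lists):
    """Return (node_freq, edge_weight) for a keyword co-occurrence graph.

    node_freq[k]        = number of papers that use keyword k (node size)
    edge_weight[(a, b)] = number of papers that use both a and b (line width),
                          with a < b so every undirected edge has one key.
    """
    node_freq = Counter()
    edge_weight = Counter()
    for keywords in keyword_lists:
        unique = sorted(set(keywords))  # ignore repeats within one paper
        node_freq.update(unique)
        edge_weight.update(combinations(unique, 2))
    return node_freq, edge_weight


# Hypothetical toy input: one keyword list per paper.
keyword_lists = [
    ["Ontology", "Semantic Web"],
    ["Linked Data", "Semantic Web"],
    ["Linked Data", "Ontology", "Semantic Web"],
]

node_freq, edge_weight = cooccurrence_graph(keyword_lists)
print(node_freq.most_common())
print(edge_weight.most_common())
```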
Web@25 Installation: Self-organizing Map
Landscape analogy: counties, mountains, and valleys
Web@25 Installation: Mapping
Location of top institutions that published on Semantic Web between 2009 and 2013.
Web@25 Installation: Mapping
Similar pattern for the Linked Data keyword between 2009 and 2013.
Web@25 Installation: Mapping
Dissimilar pattern for the Search Engine keyword between 2009 and 2013.
A Tale of Three Papers
Three Papers That Shaped the Semantic Web
Citations peaked in 2009 for the Ontology and Semantic Web papers.
More interestingly, why would you still cite these papers today?
http://stko-testing.geog.ucsb.edu/ios/
Three Papers That Shaped the Semantic Web
Top keywords: {Ontology, SW}, {Semantic Web, Ontology}, {Linked Data, Semantic Web, Ontology}
Three Papers That Shaped the Semantic Web
If a paper makes impact beyond its own home community, we should see an
increase in keyword variability (entropy).
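One way to make keyword variability measurable is the Shannon entropy of the keyword distribution across the papers that cite a given paper. The sketch below assumes that reading of "entropy"; the function name `keyword_entropy` and the two toy keyword samples are hypothetical, and the slides do not state which entropy variant was used.

```python
import math
from collections import Counter


def keyword_entropy(keywords):
    """Shannon entropy (in bits) of the keyword distribution.

    0.0 means every citing paper uses the same keyword; higher values mean
    the citing papers are spread over more, and more evenly used, keywords.
    """
    counts = Counter(keywords)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical keywords of citing papers in an early and a later period.
early = ["Ontology", "Ontology", "Semantic Web", "Semantic Web"]
later = ["Ontology", "Semantic Web", "Bioinformatics", "GIScience"]

print(keyword_entropy(early))
print(keyword_entropy(later))
```

Under this measure, a rise from the early to the later period would be consistent with impact beyond the home community, although it could also reflect a general growth in keyword vocabulary.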
And Now?
So is the Semantic Web
disappearing, diversifying, radiating, ...?
No amount of data analytics is going to answer such questions before
we precisely define and communicate what we mean by those terms.
Where Do We Go From Here?
Using Linked Data, ontologies, and basic reasoning capabilities allows us to rapidly deploy scientometrics installations.
Getting basic bibliographic data into (or as) Linked Data is becoming a trivial task.
Conflation, data enrichment, and the lack of rich metadata remain major problems.
Discovering owl:sameAs links is just a subtask of conflation.
Conflation race between academic publishers, libraries, ...
Generate and enrich the data where it is created or first processed.
We need a rich but simple ontology that goes beyond academic publishing and also includes the related processes and roles.
Revive Semantic Web Dog Food; ISWC really needs better metadata!
These slides and existing scientometrics systems are about embarrassingly simple analysis; everything else will need substantially stronger conceptual models and machinery (combining inductive & deductive methods).
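To illustrate why owl:sameAs links cover only part of conflation: once such links exist, grouping identifiers into entities is mechanical, because owl:sameAs is symmetric and transitive. The sketch below assumes the links are already given and uses a small union-find structure; the function name `sameas_clusters` and the `ex:` identifiers are hypothetical. Discovering the links, and merging conflicting attribute values afterwards, is the hard part and is not shown.

```python
def sameas_clusters(links):
    """Group identifiers connected by owl:sameAs links into clusters.

    `links` is an iterable of (a, b) pairs meaning `a owl:sameAs b`.
    Returns a sorted list of sorted clusters (symmetric, transitive closure).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical identifiers for author records from three data sources.
links = [
    ("ex:springer/janowicz", "ex:swj/k-janowicz"),
    ("ex:swj/k-janowicz", "ex:dogfood/krzysztof-janowicz"),
    ("ex:dogfood/w-zhang-1", "ex:swj/w-zhang"),
]

for cluster in sameas_clusters(links):
    print(cluster)
```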