Deciding which RDF vocabulary terms to use when modeling data as Linked Open Data (LOD) is far from trivial. We propose "TermPicker" as a novel approach enabling vocabulary reuse by recommending vocabulary terms based on various features of a term. These features include the term’s popularity, whether it is from an already used vocabulary, and the so-called schema-level pattern (SLP) feature that exploits which terms other data providers on the LOD cloud use to describe their data. We apply Learning To Rank to establish a ranking model for vocabulary terms based on the utilized features. The results show that using the SLP-feature improves the recommendation quality by 29% to 36%, measured by the Mean Average Precision and the Mean Reciprocal Rank at the first five positions, compared to recommendations based solely on the term’s popularity and whether it is from an already used vocabulary.
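The two evaluation measures named above can be illustrated with a minimal sketch. The function names, the toy rankings, and the choice to normalize average precision by min(number of relevant terms, k) are assumptions made for illustration; the source does not state which normalization was used.

```python
def average_precision_at_k(ranked, relevant, k=5):
    """AP@k: sum of precision values at each relevant hit within the top k,
    normalized by min(len(relevant), k) (an assumed normalization)."""
    hits, precision_sum = 0, 0.0
    for rank, term in enumerate(ranked[:k], start=1):
        if term in relevant:
            hits += 1
            precision_sum += hits / rank
    denom = min(len(relevant), k)
    return precision_sum / denom if denom else 0.0


def reciprocal_rank_at_k(ranked, relevant, k=5):
    """RR@k: 1 / rank of the first relevant term in the top k, else 0."""
    for rank, term in enumerate(ranked[:k], start=1):
        if term in relevant:
            return 1.0 / rank
    return 0.0


def mean_over_queries(metric, runs, k=5):
    """Average a per-query metric over (ranking, relevant set) pairs."""
    return sum(metric(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Hypothetical rankings and ground truth, not taken from the evaluation.
runs = [
    (["foaf:made", "mo:member_of", "foaf:knows"], {"mo:member_of"}),
    (["mo:Record", "foaf:Person"], {"mo:Record", "foaf:Person"}),
]
map_at_5 = mean_over_queries(average_precision_at_k, runs)
mrr_at_5 = mean_over_queries(reciprocal_rank_at_k, runs)
print(map_at_5, mrr_at_5)
```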
5. Vocabulary Term Recommendations Based on LOD
Recommender of vocabulary terms {x1, ..., xn}:
I. Query input: query-SLP slpq = ({mo:SoloMusicArtist}, ?, ?)
II. Feature computation: {F(slpq, x1), ..., F(slpq, xn)}
III. Ranking model, applied to {F(slpq, x1), ..., F(slpq, xn)}
IV. Query output:
Classes for subject: <..., mo:MusicArtist, foaf:Person,...>
Properties: <..., foaf:made,..., mo:member_of,...>
Classes for object: <..., mo:Record, mo:MusicGroup,...>
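The four steps of the recommender can be sketched roughly as follows. This is a minimal illustration, not the TermPicker implementation: the function `recommend`, the toy feature table, and the fixed linear weights are hypothetical, and the weighted sum stands in for a ranking model that would be learned with Learning To Rank.

```python
from typing import Callable, List, Sequence, Set, Tuple

# An SLP is modeled here as (subject classes, properties, object classes).
SLP = Tuple[Set[str], Set[str], Set[str]]


def recommend(
    slp_q: SLP,
    candidates: Sequence[str],
    feature_fn: Callable[[SLP, str], Sequence[float]],
    weights: Sequence[float],
) -> List[str]:
    """Rank candidates x1..xn by a linear score over F(slp_q, x)."""
    scored = []
    for x in candidates:
        features = feature_fn(slp_q, x)                         # step II
        score = sum(w * f for w, f in zip(weights, features))   # step III (stand-in)
        scored.append((score, x))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [x for _, x in scored]                               # step IV


# Hypothetical feature values per candidate (not from the source).
TOY_FEATURES = {
    "mo:member_of": [2.0, 1.0],
    "foaf:made": [3.0, 0.0],
    "foaf:knows": [1.0, 0.0],
}


def toy_feature_fn(slp_q: SLP, x: str) -> Sequence[float]:
    return TOY_FEATURES[x]


slp_q: SLP = ({"mo:SoloMusicArtist"}, set(), set())            # step I
ranking = recommend(slp_q, list(TOY_FEATURES), toy_feature_fn, weights=[1.0, 2.0])
print(ranking)
```

In this sketch the weights are fixed by hand; a learned model would replace the weighted sum while the surrounding steps stay the same.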
6. Overview
(Recommender diagram repeated: query input (I), feature computation (II), ranking model (III), query output (IV).)
7. Feature Computation: The SLP-Feature
slpq = ({mo:SoloMusicArtist}, {}, {})
Collaborative filtering:
If slpq ⊆ slpi (slpi ∈ SLPLOD)
Then sets of recommendations: slpi \ slpq
Classes for subject: < mo:MusicArtist, dbo:Actor >
Properties: < mo:member_of, foaf:made, mo:recorded >
Classes for object: < mo:MusicBand, mo:Record >
SLPLOD = SLPs computed from datasets on the LOD cloud
SLPLOD = {
  ({mo:SoloMusicArtist, mo:MusicArtist}, {mo:member_of}, {mo:MusicBand}),
  ({mo:SoloMusicArtist, dbo:Actor}, {foaf:made, mo:recorded}, {mo:Record}),
  ({foaf:Person}, {foaf:knows}, {foaf:Person})
}
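The containment rule and the example on this slide can be reproduced with a short sketch. The helper names (`slp`, `is_contained`, `slp_recommendations`) are hypothetical, and counting the number of matching SLPs per term is an assumption about how the feature value might be derived; the slide itself only shows the resulting sets.

```python
from collections import Counter


def slp(subjects, properties, objects):
    """Build an SLP as a triple of frozensets."""
    return (frozenset(subjects), frozenset(properties), frozenset(objects))


def is_contained(slp_q, slp_i):
    """slp_q is contained in slp_i if each of its three sets is a subset."""
    return all(q <= i for q, i in zip(slp_q, slp_i))


def slp_recommendations(slp_q, slp_lod):
    """For every slp_i containing slp_q, collect the componentwise difference
    slp_i minus slp_q, counting in how many matching SLPs each term occurs."""
    counts = (Counter(), Counter(), Counter())
    for slp_i in slp_lod:
        if is_contained(slp_q, slp_i):
            for counter, q, i in zip(counts, slp_q, slp_i):
                counter.update(i - q)
    return counts


# SLPs from the slide's example.
SLP_LOD = [
    slp({"mo:SoloMusicArtist", "mo:MusicArtist"}, {"mo:member_of"}, {"mo:MusicBand"}),
    slp({"mo:SoloMusicArtist", "dbo:Actor"}, {"foaf:made", "mo:recorded"}, {"mo:Record"}),
    slp({"foaf:Person"}, {"foaf:knows"}, {"foaf:Person"}),
]

slp_q = slp({"mo:SoloMusicArtist"}, [], [])
subject_classes, properties, object_classes = slp_recommendations(slp_q, SLP_LOD)
print(sorted(subject_classes), sorted(properties), sorted(object_classes))
```

The third SLP does not contain mo:SoloMusicArtist, so it contributes nothing, which matches the recommendation sets listed above.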
8. Feature Computation: State-of-the-Art Features 1)
Feature   Definition of the feature
f1        Number of datasets on the LOD cloud using the recommendation candidate x
f2        Number of datasets on the LOD cloud using the vocabulary Vx of recommendation candidate x
f3        Total number of occurrences of recommendation candidate x on the LOD cloud
f4        Whether the recommendation candidate x is from a vocabulary that is already used in query-SLP slpq
f1 to f3: Reusing popular vocabularies/vocabulary terms
f4: Reusing vocabulary terms from the same vocabulary
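The four features above could be computed along these lines. This is a sketch under stated assumptions: the `usage` table (dataset name mapped to term occurrence counts) is invented toy data, and deriving a term's vocabulary from its prefix is a simplification chosen for illustration.

```python
def vocabulary_of(term):
    """Assumed simplification: the vocabulary is the prefix before ':'."""
    return term.split(":", 1)[0]


def popularity_features(x, usage, slp_q):
    """Return [f1, f2, f3, f4] for recommendation candidate x."""
    vx = vocabulary_of(x)
    # f1: number of datasets using the term x
    f1 = sum(1 for terms in usage.values() if terms.get(x, 0) > 0)
    # f2: number of datasets using any term from x's vocabulary Vx
    f2 = sum(
        1
        for terms in usage.values()
        if any(vocabulary_of(t) == vx and n > 0 for t, n in terms.items())
    )
    # f3: total number of occurrences of x across all datasets
    f3 = sum(terms.get(x, 0) for terms in usage.values())
    # f4: 1 if Vx is already used somewhere in the query-SLP, else 0
    used_vocabularies = {vocabulary_of(t) for part in slp_q for t in part}
    f4 = 1 if vx in used_vocabularies else 0
    return [f1, f2, f3, f4]


# Hypothetical usage statistics, not real LOD cloud data.
usage = {
    "dataset_a": {"foaf:made": 10, "mo:Record": 4},
    "dataset_b": {"foaf:knows": 7},
    "dataset_c": {"mo:member_of": 2, "foaf:made": 1},
}
slp_q = ({"mo:SoloMusicArtist"}, set(), set())

print(popularity_features("foaf:made", usage, slp_q))
print(popularity_features("mo:member_of", usage, slp_q))
```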
1) Schaible, Gottron, and Scherp: Survey on Common Strategies of Vocabulary Reuse in Linked Open Data Modeling (ESWC 2014)
9. Overview
(Recommender diagram repeated: query input (I), feature computation (II), ranking model (III), query output (IV).)