Generating Illustrative Snippets
for Open Data on the Web
Gong Cheng, Cheng Jin, Wentao Ding, Danyun Xu, Yuzhong Qu


Presented at WSDM '17 in Cambridge, UK, on 7 February 2017.


  1. Generating Illustrative Snippets for Open Data on the Web
     Gong Cheng, Cheng Jin, Wentao Ding, Danyun Xu, Yuzhong Qu
     Websoft Research Group, National Key Laboratory for Novel Software Technology, Nanjing University, China
  2. The Web is in the era of open data.
  3. Dataset search engines have emerged.
  4. Metadata about a dataset is served,
  5. and only metadata is served.
  6. We propose to also serve an illustrative snippet,
     • Dataset: A set of entity-property-value triples
     • Snippet: A size-limited subset of triples
     (Diagram label: Snippet generation)
  7. and to serve a high-quality snippet.
     • Coverage: To cover the most important entity types and properties.
     • Familiarity: To contain entities familiar to average users.
     • Cohesion: To describe a set of related entities.
  8. To this end, we formulate and solve a new combinatorial optimization problem:
     • Maximum-weight-and-coverage connected graph problem (MwcCG)
  9. To this end, we formulate and solve a new combinatorial optimization problem:
     • Maximum-weight-and-coverage connected graph problem (MwcCG)
     (Diagram labels: Coverage, Familiarity, Cohesion, Quality of snippet)
  10. Experimental results
      • Baseline: PageRank-based snippet (Rietveld et al., ISWC’14)
      • Our snippet
  11. Summary
      • Motivation
        • To help people quickly know the contents of a large dataset
      • Our contribution
        • We propose to automatically extract an optimal illustrative snippet pursuing coverage, familiarity, and cohesion.
        • We formulate a new combinatorial optimization problem: to maximize coverage & weights, constrained by graph connectivity.
        • We solve the problem using an approximation algorithm.
      • Paper
        • Gong Cheng, Cheng Jin, Wentao Ding, Danyun Xu, Yuzhong Qu. Generating Illustrative Snippets for Open Data on the Web. In Proc. WSDM ’17.
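
To make the ideas in the slides concrete, here is a minimal Python sketch of a dataset as entity-property-value triples and of a size-limited, connected snippet chosen by a naive greedy heuristic. It is not the approximation algorithm from the paper. The toy triples, the familiarity scores, the frequency-based coverage weights, the additive `quality` function, and every function name are assumptions made for illustration only.

```python
from collections import Counter

# Toy dataset of (entity, property, value) triples. All names are hypothetical.
TRIPLES = [
    ("Alice", "type", "Person"),
    ("Alice", "worksFor", "Acme"),
    ("Acme", "type", "Company"),
    ("Acme", "locatedIn", "Paris"),
    ("Paris", "type", "City"),
    ("Bob", "type", "Person"),
    ("Bob", "knows", "Carol"),
]

# Assumed familiarity scores; a real system would have to estimate these.
FAMILIARITY = {"Alice": 0.2, "Acme": 0.6, "Paris": 0.9, "Bob": 0.1, "Carol": 0.1}


def facet(triple):
    """Return the entity type or property that a triple illustrates."""
    _, prop, value = triple
    return ("type", value) if prop == "type" else ("property", prop)


def quality(snippet, facet_weight, familiarity):
    """Assumed objective: weight of covered facets plus familiarity of nodes."""
    coverage = sum(facet_weight[f] for f in {facet(t) for t in snippet})
    nodes = {n for s, _, o in snippet for n in (s, o)}
    return coverage + sum(familiarity.get(n, 0.0) for n in nodes)


def is_connected(snippet):
    """Cohesion check: treat each triple as an edge between subject and value."""
    if not snippet:
        return True
    reached = {snippet[0][0], snippet[0][2]}
    remaining = list(snippet[1:])
    changed = True
    while changed:
        changed = False
        for t in list(remaining):
            if t[0] in reached or t[2] in reached:
                reached.update((t[0], t[2]))
                remaining.remove(t)
                changed = True
    return not remaining


def greedy_snippet(triples, familiarity, k):
    """Grow a connected snippet of at most k triples, one best triple at a time."""
    counts = Counter(facet(t) for t in triples)
    # Assumption: a facet is more important the more often it occurs.
    facet_weight = {f: c / len(triples) for f, c in counts.items()}
    snippet, nodes = [], set()
    while len(snippet) < k:
        # After the first pick, only triples touching the snippet are allowed,
        # which keeps the snippet connected by construction.
        candidates = [
            t for t in triples
            if t not in snippet and (not snippet or t[0] in nodes or t[2] in nodes)
        ]
        if not candidates:
            break
        best = max(candidates,
                   key=lambda t: quality(snippet + [t], facet_weight, familiarity))
        snippet.append(best)
        nodes.update((best[0], best[2]))
    return snippet


snippet = greedy_snippet(TRIPLES, FAMILIARITY, k=3)
for triple in snippet:
    print(triple)
```

Restricting candidates to triples that share a node with the current snippet is the simplest way to enforce the connectivity constraint. The trade-off is that a greedy choice can miss a better snippet rooted elsewhere in the graph, and this sketch makes no approximation guarantee.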
