TAIPAN: Automatic Property Mapping for Tabular Data

TAIPAN: Automatic Property Mapping for Tabular Data by Ivan Ermilov and Axel-Cyrille Ngonga Ngomo in Proceedings of 20th International Conference on Knowledge Engineering and Knowledge Management (EKAW'2016)

  1. TAIPAN: Automatic Property Mapping for Tabular Data, by Ivan Ermilov and Axel-Cyrille Ngonga Ngomo. November 22nd, 2016.
  2. Web Scale Data Mining from Web Tables
     Web Data Commons, Dresden Table Dataset, other tables
     The Web
     TAIPAN
     ● Structured
     ● Schemaless
     ● Not using standards*
     ● SPARQL
     ● RDFS
     ● OWL
  3. TAIPAN Approach Overview (Steps 1 to 4)
     ● Identify Subject Column
     ● Atomize a Table
     ● Identify Property for Each Table
     ● Return Mappings
  4. TAIPAN Approach Overview (example): figure showing steps 1 to 4
  5. The Core of TAIPAN
     Subject Column Identification
     ● Unsupervised ML
     ● Structural features
     ● Semantic features
       ○ Support of a column
       ○ Connectivity
     Property Mapping
     ● Retrieve seed entities
     ● Rank entities
     ● Return top entity
  6. Experimental setup
     ● For T2K: 128GB, 4 cores, Ubuntu 14.04
     ● For TAIPAN: 16GB, 4 cores, Ubuntu 14.04
     ● Dataset 1: curated T2D gold standard (T2D)
     ● Dataset 2: DBpedia table dataset (DBD)
  7. Subject Column Identification Experiments
     Observations
     ● Rule-based approach achieves only 51.72% accuracy
     ● Using support and connectivity increases precision
     ● Can be further improved using ML techniques
  8. Property Mapping Experiments
     Observations
     ● TAIPAN achieves better recall, but lower precision than T2D
     ● On the DBD dataset T2K could match only 1 property
     ● Overall TAIPAN performs better than the state of the art
  9. Conclusions & Future Work
     ● Curated T2D & DBD datasets
     ● Novel TAIPAN approach
     ● Open Table Extraction
     ● Table Extraction Benchmark (HOBBIT)
     ● Integration of TAIPAN into GEISER project
  10. Thank you! Follow us on twitter :) Ivan Ermilov <iermilov@informatik.uni-leipzig.de> @hobbit_project
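
The slides name support and connectivity as the semantic features for subject column identification but do not define them. The Python sketch below is only an illustration under assumed definitions: support is taken as the share of cells in a column that match a known entity label, and connectivity as the share of other columns that are linked to the column in at least one row. The toy knowledge base (`KB_ENTITIES`, `KB_LINKS`), the function names, and the plain sum used for scoring are all hypothetical; a real system would query a knowledge base such as DBpedia and would combine the features differently.

```python
from typing import List, Set, Tuple

# Hypothetical toy knowledge base (not from the slides).
# KB_ENTITIES: labels that resolve to an entity.
# KB_LINKS: (subject label, object value) pairs connected by some property.
KB_ENTITIES: Set[str] = {"Berlin", "Paris", "Germany", "France"}
KB_LINKS: Set[Tuple[str, str]] = {
    ("Berlin", "Germany"),
    ("Paris", "France"),
    ("Berlin", "3500000"),
}

# A table is stored column by column; every column has the same length.
Table = List[List[str]]


def support(column: List[str]) -> float:
    """Assumed definition: share of cells that match a known entity label."""
    if not column:
        return 0.0
    return sum(cell in KB_ENTITIES for cell in column) / len(column)


def connectivity(table: Table, idx: int) -> float:
    """Assumed definition: share of other columns linked to column idx
    in at least one row, with column idx acting as the subject."""
    others = [j for j in range(len(table)) if j != idx]
    if not others:
        return 0.0
    rows = range(len(table[idx]))
    connected = sum(
        any((table[idx][r], table[j][r]) in KB_LINKS for r in rows)
        for j in others
    )
    return connected / len(others)


def pick_subject_column(table: Table) -> int:
    """Return the index of the column with the highest combined score.
    The plain sum is a placeholder for a real ranking model."""
    scores = [support(col) + connectivity(table, i) for i, col in enumerate(table)]
    return max(range(len(table)), key=scores.__getitem__)


# Illustrative table: population, city, country.
EXAMPLE_TABLE: Table = [
    ["3500000", "2100000"],
    ["Berlin", "Paris"],
    ["Germany", "France"],
]

if __name__ == "__main__":
    print(pick_subject_column(EXAMPLE_TABLE))
```

On the example table the city column is selected, because all of its cells match entities and both other columns are linked to it in the toy data. The country column also has full support but no outgoing links, which is why a connectivity-style feature helps separate the two.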
