
Semantic-assisted Analysis and Search in Customer Specifications

Talk at Future Search Engines 2014 (FoRESEE), INFORMATIK2014 (http://www.informatik2014.de/)

Abstract:
Targeted search for information in large document collections is one of the key challenges of our time. This paper describes how we implemented the analysis of, and search in, multilingual customer specifications in a current customer project in mechanical engineering. The document analysis uses computational-linguistic and semantic technologies. The search is based on the faceted browsing paradigm.

Published in: Software

  1. Semantic-assisted Analysis and Search in Customer Specifications. Martin Voigt, Daniel Hladky, September 2014. Diagram labels: ONTOS LINKED DATA INFORMATION WORKBENCH; Extraction & Analysis; Indexing; Information & Knowledge Management; Search; Storage; Engineer; Sales; Portal; Multilingual Specifications.
  2. I speak about: The Problem, Our Solution, Insights & Further Work.
  3. 3. The Problem AviComp Controls GmbH  leading engineering contractor for rotating machinery controls 3 Customers Engineers Sales > 100k Technical Specifications http://www.avicomp.com/capabilities/turbo-compressor-controls.html
  4. 4. The Problem Analysis: 1) task, 2) current solution, 3) ideas Problems Multiple, inefficient tools Heterogeneity Knowledge management & transfer 4 http://answerhub.com/article/ the-cost-of-knowledge-loss/
  5. Our Solution. The Ontos Linked Data Information Workbench (http://www.ontos.com/products/ontosldiw/).
  6. 6. Our Solution Extraction& Analysis Homogenization: PDF conversion (Apache POI) & OCR (CuneiForm) Text extraction (Apache Tika) Language detection (language-detection API) Text preparation, e.g., remove headers & footers SKOS-based concept identification 6 Lorem ipsum dolor sit amet, consetetursadipscing elitr, seddiamnonumyeirmodtemporinviduntutlaboreet doloremagna aliquyam erat, seddiamvoluptua. At veroeoset accusamet justoduo doloreset earebum. Stet clitakasdgubergren, no sea takimata sanctusestLorem ipsum dolor sit elitr, seddiamnonumyeirmodtemporinviduntutlaboreet doloremagna aliquyam erat, seddiamvoluptua. At veroeoset accusamet justoduo doloreset earebum. Stet clitakasdgubergren, no sea takimata elitr, seddiamnonumyeirmodtemporinviduntutlaboreet doloremagna aliquyam erat, seddiamvoluptua. At veroeoset accusamet justoduo doloreset earebum. Stet clitakasdgubergren, no sea takimata ONTOS LINKED DATA INFORMATION WORKBENCHExtraction & AnalysisIndexingInformation & Knowledge ManagementSearchEngineer Storage Sales Portal MultilingualSpecifications
  7. 7. Our Solution Storage via OntoQUAD  Triple and/or QuadStore, SPARQL 1.1, … Indexing  Full text search, result grouping, faceted browsing, SKOS-based label expansion, …  Apache Solr with lucene-skos plugin (https://github.com/behas/lucene-SKOS) 7 ONTOS LINKED DATA INFORMATION WORKBENCH Extraction & Analysis Indexing Information & Knowledge Management Search Engineer Storage Sales Portal Multilingual Specifications
  8. 8. Our Solution Knowledge Management via OntoDixbut SKOS-only 8 ONTOS LINKED DATA INFORMATION WORKBENCHExtraction & AnalysisIndexingInformation & Knowledge ManagementSearchEngineer Storage Sales Portal MultilingualSpecifications
  9. 9. Our Solution Search via AJAX Solr(https://github.com/evolvingweb/ajax-solr) 9 ONTOS LINKED DATA INFORMATION WORKBENCHExtraction & AnalysisIndexingInformation & Knowledge ManagementSearchEngineer Storage Sales Portal MultilingualSpecifications
  10. Insights & Further Work. Iterative development with early customer testing lowers the usage barrier; lessons learned; development of a knowledge base; faceted search user interface; faceted search on RDF; multilingual disambiguation mechanisms.
  11. Q&A. Martin Voigt, Ontos AG / GmbH, Nidau (CH) / Leipzig (DE). T: +49 341 21559-10, M: +49 178 40 222 58, E: martin.voigt@ontos.com
  12. About Ontos. DoW – CTI Project. Ontos Group key facts: established 2001; 15+ employees; share in Eventos RU (30 people); 5± million CHF turnover. Industry: media/news; law enforcement; government; (Russia).
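
The text preparation and SKOS-based concept identification steps from the Extraction & Analysis slide can be illustrated with a minimal Python sketch. It is not the Ontos implementation: the vocabulary `VOCAB`, the `ex:` concept ids, the function names, and the 80% repetition threshold are all assumptions made for illustration, and the language code is assumed to come from an earlier language-detection step.

```python
import re
from collections import Counter

# Hypothetical SKOS-like vocabulary: concept id -> labels per language.
VOCAB = {
    "ex:Compressor": {"en": ["compressor", "turbo compressor"],
                      "de": ["Verdichter", "Kompressor"]},
    "ex:Valve": {"en": ["valve"], "de": ["Ventil"]},
}


def strip_repeated_lines(pages, min_share=0.8):
    """Drop lines that recur on most pages (likely headers or footers).

    `pages` is a list of pages, each a list of text lines.
    """
    if len(pages) < 2:
        return pages  # repetition cannot be judged from a single page
    counts = Counter()
    for page in pages:
        # Count each distinct non-empty line once per page.
        counts.update({line.strip() for line in page if line.strip()})
    threshold = min_share * len(pages)
    repeated = {line for line, n in counts.items() if n >= threshold}
    return [[line for line in page if line.strip() not in repeated]
            for page in pages]


def identify_concepts(text, lang, vocab=VOCAB):
    """Return sorted ids of concepts whose label for `lang` occurs in `text`."""
    found = set()
    for concept, labels in vocab.items():
        for label in labels.get(lang, []):
            pattern = r"\b" + re.escape(label) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                found.add(concept)
    return sorted(found)


pages = [
    ["ACME Spec Rev. 3", "The turbo compressor shall start within 5 s.", "Confidential"],
    ["ACME Spec Rev. 3", "Each valve is tested.", "Confidential"],
]
cleaned = strip_repeated_lines(pages)
concepts = identify_concepts(" ".join(cleaned[0]), "en")
```

Exact-match repetition misses headers that vary per page (for example running page numbers), and plain label matching does no stemming or disambiguation, so a production pipeline would need more than this.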

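
The faceted search with label expansion described on the Indexing and Search slides can be sketched as a Solr request builder. In the talk, expansion is done inside Solr by the lucene-skos plugin; the sketch below swaps in a simpler client-side expansion to make the idea visible. The label table `LABELS`, the field names `language` and `customer`, and all function names are assumptions; only the standard Solr parameters (`q`, `fq`, `rows`, `wt`, `facet`, `facet.field`, `facet.mincount`) are real.

```python
from urllib.parse import urlencode

# Hypothetical label table standing in for a SKOS thesaurus.
LABELS = {
    "compressor": ["Verdichter", "Kompressor"],
}


def expand_term(term, labels=LABELS):
    """OR the term with its alternative labels, quoting each as a phrase."""
    variants = [term] + labels.get(term.lower(), [])
    quoted = ['"' + v.replace('"', '\\"') + '"' for v in variants]
    return "(" + " OR ".join(quoted) + ")"


def build_facet_query(term, facet_fields, filters=None, rows=10):
    """Build the query string for a Solr /select request with field faceting."""
    params = [
        ("q", expand_term(term)),
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),
    ]
    # One facet.field parameter per facet shown in the user interface.
    params += [("facet.field", name) for name in facet_fields]
    # Each selected facet value becomes a filter query.
    params += [("fq", f'{field}:"{value}"')
               for field, value in (filters or {}).items()]
    return urlencode(params)


def pair_facet_counts(flat):
    """Turn Solr's default flat [value, count, value, count, ...] list into pairs."""
    return list(zip(flat[0::2], flat[1::2]))


query_string = build_facet_query("compressor", ["language", "customer"],
                                 filters={"language": "de"})
```

The resulting string would be appended to a core's `/select` URL. `pair_facet_counts` handles the flat list that Solr returns under `facet_counts.facet_fields` in its default JSON output.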