The Semantic Data Web for Enterprises: Vision, Technology, Applications
Sören Auer, AKSW Research Group
Why Semantic Web?
Problem: Try to search for these things on the current Web:
- Apartments near German-Russian bilingual childcare in Leipzig.
- ERP service providers with offices in Vienna and London.
- Researchers working on multimedia topics in Eastern Europe.
The information is available on the Web, but opaque to current Web search.
Solution: complement text on Web pages with structured linked open data, and intelligently combine and integrate such structured information from different sources.
[Figure: a search engine consumes HTML and RDF from several Web servers backed by databases. leipzig.de has everything about childcare in Potsdam. Immobilienscout.de knows all about real estate offers in Germany.]
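The idea of combining structured data from two sources can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the triples, the `ex:` identifiers, and the modelling of "near" as "in the same district" are all made up for the example and do not come from leipzig.de or Immobilienscout.de.

```python
# Made-up sample triples (subject, predicate, object); all names are hypothetical.
childcare_source = [
    ("ex:KitaA", "rdf:type", "ex:Childcare"),
    ("ex:KitaA", "ex:language", "German-Russian"),
    ("ex:KitaA", "ex:district", "ex:DistrictSouth"),
    ("ex:KitaB", "rdf:type", "ex:Childcare"),
    ("ex:KitaB", "ex:language", "German-English"),
    ("ex:KitaB", "ex:district", "ex:DistrictNorth"),
]
real_estate_source = [
    ("ex:Apartment1", "rdf:type", "ex:Apartment"),
    ("ex:Apartment1", "ex:district", "ex:DistrictSouth"),
    ("ex:Apartment2", "rdf:type", "ex:Apartment"),
    ("ex:Apartment2", "ex:district", "ex:DistrictNorth"),
]


def subjects(triples, predicate, obj):
    """All subjects that have the given predicate/object pair."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def objects(triples, subject, predicate):
    """All objects for the given subject/predicate pair."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def apartments_near_childcare(childcare, real_estate, language):
    # Integration is a plain union because both sources share identifiers.
    merged = childcare + real_estate
    matching_care = subjects(merged, "rdf:type", "ex:Childcare") & subjects(
        merged, "ex:language", language
    )
    districts = set()
    for care in matching_care:
        districts |= objects(merged, care, "ex:district")
    apartments = subjects(merged, "rdf:type", "ex:Apartment")
    # Assumption: "near" means "located in the same district".
    return sorted(
        a for a in apartments if objects(merged, a, "ex:district") & districts
    )


result = apartments_near_childcare(
    childcare_source, real_estate_source, "German-Russian"
)
print(result)
```

The point of the sketch is that neither source alone can answer the question; the answer only appears once both sets of statements use shared identifiers and can be merged.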
From the Web of Documents to the Semantic Data Web
Semantic Web (vision 1998, starting ???): reasoning; logic, rules; trust
Data Web (since 2006): URI de-referencability; Web data integration; RDF serializations
Social Web (since 2003): folksonomies/tagging; reputation, sharing; groups, relationships
Web (since 1992): HTTP; HTML/CSS/JavaScript

The Long Tail of Information Domains
[Figure: The Long Tail by Chris Anderson (Wired, Oct. '04), adapted to information domains, with popularity on the vertical axis. Currently supported structured content types: pictures, recipes, news, video, calendar. Not or insufficiently supported content types, for which the Semantic Web supports structured content: requirements engineering, talent management, special-interest communities, itinerary of King George, gene sequences, ...]
The Vision: A Web of Linked Data
[Figure: the stages interlink, fuse, create, repair, classify, and enrich, annotated with tools and years (2007 to 2009): SILK, DXX Engine, poolparty, SemMF, OntoWiki, Sigma, WiQA, ORE, Virtuoso, DL-Learner, MonetDB, Sindice.]
Semantic Web Standards
Standardization of the Semantic Web
1994: First public presentation of the Semantic Web idea
1998: Start of standardization of the data model (RDF) and a first ontology language (RDFS) at W3C
2000: Start of large research projects about ontologies in the US and Europe (DAML & Ontoknowledge)
2002: Start of standardization of a new ontology language (OWL) based on research results
2004: Finalization of the standards for data (RDF) and ontology (OWL)
2006: Standardization of a query language (SPARQL, 6 April 2006); ongoing work on rule languages (SWRL, DL-safe rules, RIF); extension of OWL to OWL 1.1 / 2.0; ontology language of OMG based on UML (ODM)
Now standardized: RDFa (2008), OWL 2 (2009)
[Slide labels: Semantic Web Architecture; Current research]
Data Access and Integration at the Semantic Level
Enterprise Information Integration: sets of heterogeneous data sources appear as a single, homogeneous data source
Research: mediators, ontology-based, P2P, Web service-based
Data Web: URIs as entity identifiers, HTTP as data access protocol, Local-As-View (LAV)
Data warehousing: based on extract, transform, load (ETL); Global-As-View (GAV)
Data Integration
Object-relational mappings (ORM): NeXT's EOF / WebObjects, ADO.NET Entity Framework, Hibernate
Query languages: Datalog, SQL, SPARQL, XPath/XQuery
Linked Data: de-referencable URIs, RDF serialization formats
Procedural APIs: ODBC, JDBC
Data Access
Triple/quad stores: RDF data model

The Semantic Data Web for Enterprises

  • 26.
    Triple/quad store examples: Virtuoso, Oracle, Sesame
    RDBMS: organize data in relations, rows, and cells
  • 27.
    RDBMS examples: Oracle, DB2, MS-SQL
    Others: XML, hierarchical, tree, and graph-oriented DBMS
    Column-oriented DBMS: collocate column values rather than row values
  • 28.
    Column-oriented DBMS examples: Vertica, C-Store, MonetDB
    Data Models
    Entity-attribute-value (EAV): HELP medical record system, TrialDB

    Linked Data Web Technology
    1. Uses RDF as its data model
       [Figure: RDF graph with the nodes AKSW, LSWT2010, 6.5.2010, and Leipzig and the edges organizes, takesPlaceAt, and takesPlaceIn]
    2. Is serialized as triples:
       AKSW organizes LSWT2010
       LSWT2010 takesPlaceAt "20100506"^^xsd:date
       LSWT2010 takesPlaceAt Leipzig
    3. Uses content negotiation
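The three points on this slide (RDF as data model, serialization as triples, content negotiation) can be sketched together in Python. This is a minimal sketch under stated assumptions: the `http://example.org/` namespace is hypothetical because the slide gives no full URIs, the date is written as `2010-05-06` because that is the lexical form `xsd:date` expects, and the `negotiate` helper only handles exact media types and `q` values, not wildcards.

```python
BASE = "http://example.org/"  # hypothetical namespace, not from the slide
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def uri(name):
    """Wrap a local name as a full URI term in N-Triples syntax."""
    return f"<{BASE}{name}>"


def date_literal(value):
    """Typed literal in N-Triples syntax."""
    return f'"{value}"^^<{XSD_DATE}>'


# The three statements from the slide as (subject, predicate, object).
triples = [
    (uri("AKSW"), uri("organizes"), uri("LSWT2010")),
    (uri("LSWT2010"), uri("takesPlaceAt"), date_literal("2010-05-06")),
    (uri("LSWT2010"), uri("takesPlaceAt"), uri("Leipzig")),
]


def to_ntriples(statements):
    """Serialize statements one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


def negotiate(accept_header, available):
    """Pick the available media type with the highest q value, or None."""
    preferences = []
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        quality = 1.0
        for field in fields[1:]:
            key, _, value = field.partition("=")
            if key.strip() == "q":
                quality = float(value)
        preferences.append((quality, fields[0]))
    # sorted() is stable, so equal q values keep their header order.
    for quality, media_type in sorted(preferences, key=lambda p: -p[0]):
        if quality > 0 and media_type in available:
            return media_type
    return None


AVAILABLE = ["text/html", "application/n-triples"]
chosen = negotiate("text/html;q=0.5, application/n-triples", AVAILABLE)
print(chosen)
print(to_ntriples(triples))
```

A browser that sends `Accept: text/html` would get the human-readable page for the same URI, while a data client asking for `application/n-triples` would get the statements themselves.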
  • 29.
    RDF Vocabularies: Class and Property Hierarchies
    Beer rdf:type rdfs:Class
    BottomFermentedBeer rdfs:subClassOf Beer
    Bock rdfs:subClassOf BottomFermentedBeer
    Lager rdfs:subClassOf BottomFermentedBeer
    Pilsner rdfs:subClassOf BottomFermentedBeer
    hasContent rdf:type rdf:Property
    hasAlcoholicContent rdfs:subPropertyOf hasContent
    hasOriginalWortContent rdfs:subPropertyOf hasContent
  • 30.
    RDF-S Instances
    Instances are assigned to one or more classes:
    Boddingtons rdf:type Ale
    Grafentrunk rdf:type Bock
    Hoegaarden rdf:type White
    Jever rdf:type Pilsner
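What the class hierarchy buys you is inference: an instance of Bock is also a BottomFermentedBeer and a Beer. The following Python sketch illustrates this with the beer classes from the slides. It assumes, for simplicity, that every class has at most one direct superclass, and it leaves out Ale and White because the slides do not place them in the hierarchy.

```python
# Direct rdfs:subClassOf statements (child -> parent), taken from the slide.
SUBCLASS_OF = {
    "BottomFermentedBeer": "Beer",
    "Bock": "BottomFermentedBeer",
    "Lager": "BottomFermentedBeer",
    "Pilsner": "BottomFermentedBeer",
}

# Direct rdf:type statements (instance -> class), taken from the slide.
INSTANCE_OF = {
    "Grafentrunk": "Bock",
    "Jever": "Pilsner",
}


def superclasses(cls):
    """Follow rdfs:subClassOf upwards and collect every ancestor class."""
    ancestors = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        ancestors.append(cls)
    return ancestors


def inferred_types(instance):
    """Direct type plus all types implied by the class hierarchy."""
    direct = INSTANCE_OF[instance]
    return [direct] + superclasses(direct)


print(inferred_types("Jever"))
```

A query for all instances of Beer can therefore return Jever and Grafentrunk even though neither is typed as Beer directly.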
  • 31.
  • 32.
    Integrating RDF and HTML: RDFa
    <div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
      <p property="foaf:name"> Alice Birpemswick </p>
      <p> Email: <a rel="foaf:mbox" href="mailto:alice@exa.com">alice@exa.com</a> </p>
      <p> Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a> </p>
    </div>
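To show how such annotations become data, here is a deliberately simplified Python sketch that reads the `typeof`, `property`, and `rel` attributes from the snippet with the standard-library `html.parser`. It is not a conforming RDFa processor: it assumes a single subject (the placeholder `_:subject` is an invention of this example), ignores prefix resolution, and assumes that elements carrying `property` contain only text.

```python
from html.parser import HTMLParser

SNIPPET = """
<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <p property="foaf:name"> Alice Birpemswick </p>
  <p> Email: <a rel="foaf:mbox" href="mailto:alice@exa.com">alice@exa.com</a> </p>
  <p> Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a> </p>
</div>
"""


class SimpleRDFaReader(HTMLParser):
    """Collects (subject, predicate, object) tuples from a few RDFa attributes."""

    SUBJECT = "_:subject"  # hypothetical placeholder for the described person

    def __init__(self):
        super().__init__()
        self.triples = []
        self._open_property = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "typeof" in attributes:
            self.triples.append((self.SUBJECT, "rdf:type", attributes["typeof"]))
        if "rel" in attributes and "href" in attributes:
            self.triples.append((self.SUBJECT, attributes["rel"], attributes["href"]))
        if "property" in attributes:
            # Remember the property; its value is the element's text content.
            self._open_property = attributes["property"]
            self._text = []

    def handle_data(self, data):
        if self._open_property is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open_property is not None:
            value = "".join(self._text).strip()
            self.triples.append((self.SUBJECT, self._open_property, value))
            self._open_property = None


reader = SimpleRDFaReader()
reader.feed(SNIPPET)
reader.close()
for triple in reader.triples:
    print(triple)
```

The same HTML that a browser renders as a contact card thus also yields four statements about a `foaf:Person`.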
  • 33.
    Application Potential in the Enterprise
    - Integration of heterogeneous information sources by means of ontologies and background knowledge (e.g., DBpedia)
    - Semantic wikis (e.g., OntoWiki) help to create and manage structured knowledge bases
  • 34.
    Transformation of Wikipedia into a Knowledge Base
    - DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web.
    - It allows sophisticated queries against Wikipedia (e.g., universities in Brandenburg, mayors of elevated towns, soccer players) and lets other data sets on the Web link to Wikipedia data.
    - It represents a community consensus.
    - The recently launched DBpedia Live transforms Wikipedia into a structured knowledge base.
    References:
    S. Auer, C. Bizer, J. Lehmann, G. Kobilarov, R. Cyganiak, Z. Ives: DBpedia: A Nucleus for a Web of Open Data. 6th International Semantic Web Conference, ISWC 2007.
    S. Auer, J. Lehmann: What have Innsbruck and Leipzig in common? Extracting Semantics from Wiki Content. 4th European Semantic Web Conference, ESWC 2007.
  • 35.
    Structure in Wikipedia
    - Title
    - Abstract
    - Infoboxes
    - Geo-coordinates
    - Categories
    - Images
    - Links: to other language versions, to other Wikipedia pages, to the Web
    - Redirects
    - Disambiguations
  • 36.
    Infobox Templates
    Wikitext syntax:
    {{Infobox Korean settlement
    | title = Busan Metropolitan City
    | img = Busan.jpg
    | imgcaption = A view of the [[Geumjeong]] district in Busan
    | hangul = 부산 광역시
    ...
    | area_km2 = 763.46
    | pop = 3635389
    | popyear = 2006
    | mayor = Hur Nam-sik
    | divs = 15 wards (Gu), 1 county (Gun)
    | region = [[Yeongnam]]
    | dialect = [[Gyeongsang]]
    }}
    RDF representation (http://dbpedia.org/resource/Busan):
    dbp:Busan dbpp:title "Busan Metropolitan City"
    dbp:Busan dbpp:hangul "부산 광역시"@Hang
    dbp:Busan dbpp:area_km2 "763.46"^^xsd:float
    dbp:Busan dbpp:pop "3635389"^^xsd:int
    dbp:Busan dbpp:region dbp:Yeongnam
    dbp:Busan dbpp:dialect dbp:Gyeongsang
    ...
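The mapping from infobox parameters to triples can be illustrated with a short Python sketch. This is not the actual DBpedia extraction framework; it is a toy version that assumes one `| key = value` parameter per line, treats a value consisting of a single `[[link]]` as a resource, and guesses `xsd:int` or `xsd:float` from the value's shape. Language tags such as `@Hang` are not handled.

```python
import re

WIKITEXT = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}"""

# A value that is exactly one wiki link, e.g. [[Yeongnam]].
WIKI_LINK = re.compile(r"^\[\[([^\]|]+)\]\]$")


def infobox_to_triples(subject, wikitext):
    """Turn '| key = value' lines into (subject, predicate, object) tuples."""
    triples = []
    for line in wikitext.splitlines():
        line = line.strip()
        if not line.startswith("|") or "=" not in line:
            continue  # skips the '{{Infobox ...' and '}}' lines
        key, _, value = line[1:].partition("=")
        key, value = key.strip(), value.strip()
        link = WIKI_LINK.match(value)
        if link:
            obj = "dbp:" + link.group(1).replace(" ", "_")
        elif re.fullmatch(r"-?\d+", value):
            obj = f'"{value}"^^xsd:int'
        elif re.fullmatch(r"-?\d+\.\d+", value):
            obj = f'"{value}"^^xsd:float'
        else:
            obj = f'"{value}"'
        triples.append((subject, "dbpp:" + key, obj))
    return triples


busan = infobox_to_triples("dbp:Busan", WIKITEXT)
for s, p, o in busan:
    print(s, p, o)
```

Each template parameter becomes a property of the page's resource, which is why consistent infobox usage across articles translates directly into queryable data.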
  • 37.
    A Large Multilingual, Multi-Domain Knowledge Base
    The DBpedia extraction results in:
    - Descriptions of about 3.4 million things, 1.5 million of them classified in a consistent ontology, including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000 video games, 140,000 organizations, 146,000 species, and 4,600 diseases
    - Labels and summaries in 92 different languages; 1,460,000 links to images and 5,543,000 links to external web pages; 4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories
    - Altogether more than 1 billion facts (i.e., RDF triples): 257M from the English edition, 766M from other language editions
    DBpedia leaves visible traces in science, technology, and society:
    - DBpedia became the central interlinking hub on the Data Web
    - Scientific publications attracted more than 500 citations
    - More than 15,000 monthly visits on DBpedia.org, numerous press articles, blog posts, ...
    - Ecosystem of commercial and community applications: Thomson Reuters, BBC, Neofonie, Openlink, Faviki, ...
  • 38.
    The Semantic Data Wiki
    Agile, distributed knowledge engineering
    Not a wiki with a semantic extension (Semantic MediaWiki, IkeWiki), but an ontology editor that uses wiki concepts:
    - Make it easy to correct mistakes (ant intelligence)
    - Activity can be watched and reviewed
    - Everything can be undone
  • 39.
  • 42.
    SoftWiki
    Problem: requirements engineering with large, geographically distributed stakeholder groups
    Solution:
    - Comprehensive ontology for RE knowledge plus an adapted OntoWiki application
    - Application of text mining algorithms for duplicate detection
  • 43.
  • 44.
    Take Home Messages
    - The Semantic Web supports the integration of data on the Web (uniform triple data model)
    - Standardized (W3C) Linked Data technology base
    - Ontologies and background knowledge (e.g., DBpedia) help with the integration of heterogeneous information sources
    - Semantic wikis help to create and manage RDF knowledge bases
  • 45.
    Thank you!
    Sören Auer, auer@informatik.uni-leipzig.de
    Agile Knowledge Engineering & Semantic Web (AKSW), http://aksw.org
    Part-time master's program "Content & Media Engineering":
    M1: Media Production (GMP)
    M2: Web Technologies (WT)
    M3: Content and Knowledge Management Systems (CWM)
    M4: Cross-media Production (CP)
    M5: Media Economics and Media Management (MW)
    M6: Project Work (PA)
    M7: E-Business (EB)
    http://www.leipzigschoolofmedia.de/
    Mediencampus "Villa Ida"

Editor's Notes

  • #5 Popular content types such as pictures, movies, calendars, encyclopedic articles, news, and recipes are already sufficiently well supported on the Web. However, there is a long tail of special-interest content (profiles of expertise, historic data and events, bio-medical knowledge, intra-corporate knowledge, etc.) that has very little or no current support on the Web for filtering, aggregation, searching, querying, or collaborative editing.