Semantic Web and Machine Learning Tutorial



  1. 1. Agenda: Semantic Web and Machine Learning Tutorial
     Steffen Staab, ISWeb (Information Systems and Semantic Web), University of Koblenz, Germany
     Andreas Hotho, Knowledge and Data Engineering Group, University of Kassel, Germany
     • Introduction
     • Foundations of the Semantic Web
     • Ontology Learning
     • Learning Ontology Mapping
     • Semantic Annotation
     • Using Ontologies
     • Applications

     Syntax is not enough: Information Convergence
     • Convergence not just in devices, but also in "information"
       – Your personal information (phone, PDA, …): calendar, photos, home page, files, …
       – Your "professional" life (laptop, desktop, …, Grid): web site, publications, files, databases, …
       – Your "community" contexts (Web): hobbies, blogs, fanfic, social networks, …
     • The Web teaches us that people will work to share
       – How do we CREATE, SEARCH and BROWSE in the non-text-based parts of our lives?
  2. 2. Meaning of Information: XML ≠ Meaning, XML = Structure (or: what it means to be a computer)
     To a computer, tags such as <name>, <education>, <CV>, <work>, <private> are pure structure: rendered in an unreadable symbol font, they carry exactly the same information.

     Source of Problems (One)
     XML is unspecific:
     • No predetermined vocabulary
     • No semantics for relationships; both must be specified upfront
     • Only possible in close cooperations
       – Small, reasonably stable group
       – Common interests or authorities
     • Not possible in the Web or on a broad scale in general!

     Layer Model of the Semantic Web (layer-cake figure)
  3. 3. Some Principal Ideas
     • URI – uniform resource identifiers
     • XML – common syntax
     • Interlinked
     • Layers of semantics, from databases and knowledge bases to proofs (Tim Berners-Lee, Weaving the Web)
     Design principles of the WWW applied to semantics!

     What is an Ontology?
     Gruber 93: An ontology is a formal specification of a shared conceptualization
     ⇒ Executable (formal)
     ⇒ Shared by a group of persons
     ⇒ About concepts of a domain of interest
     ⇒ Between application and "unique truth"

     Taxonomy
     Taxonomy := segmentation, classification and ordering of elements into a classification system according to their relationships between each other
     Example ("Menu"): Object → Person → Student → {Doctoral Student, PhD Student}; Object → Topic → Semantics → {F-Logic, Ontology}; Object → Document

     Thesaurus
     • Terminology for a specific domain
     • Graph with primitives and 2 fixed relationships (similar, synonym), e.g. "Doctoral Student" synonym "PhD Student"; "F-Logic" similar "Ontology"
     • Originates from bibliography
  4. 4. Topic Map
     • Topics (nodes), relationships and occurrences (links to documents), e.g. Person and Topic nodes with subTopicOf, similar and described_in links
     • ISO standard
     • Typically used for navigation and visualisation

     Ontology (in our sense)
     • Concepts with is_a hierarchy (Person → Student → PhD Student, Researcher), typed relations (knows, writes, described_in, is_about), attributes (Tel, Affiliation), instances and rules
     • Rule examples: T described_in D → T is_about D;  P writes D ∧ D is_about T → P knows T
     • Representation language: predicate logic (F-Logic)
     • Standards: RDF(S); upcoming standard: OWL

     The Semantic Web: What's in a link? Formally
     • RDF: an edge in a graph
     • OWL: consistency (+ subsumption + classification + …), e.g. cooperatesWith with rdfs:domain and rdfs:range Person; Employee rdfs:subClassOf Person; PostDoc and Professor rdfs:subClassOf Employee
     • Currently under discussion – Rules: a deductive database
     • Currently under intense research – Proof: worked-out proofs; Trust: signatures & everything working together
     Metadata example (from ontology to web page):
       <swrc:PostDoc rdf:ID="person_sha">
         <swrc:name>Siegfried Handschuh</swrc:name>
         <swrc:cooperatesWith rdf:resource="#person_sst"/>
         ...
       </swrc:PostDoc>
       <swrc:Professor rdf:ID="person_sst">
         <swrc:name>Steffen Staab</swrc:name>
         ...
       </swrc:Professor>
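The swrc metadata above is just a set of subject-predicate-object edges. A minimal sketch (plain Python tuples, not the tutorial's code; the namespace URI is assumed) of storing those triples and computing a simple rdfs:subClassOf closure:

```python
# Represent the RDF statements from the swrc example as triples and infer
# types via a toy subclass hierarchy (PostDoc/Professor -> Employee -> Person).
SWRC = "http://swrc.ontoware.org/ontology#"   # assumed namespace URI

triples = {
    ("person_sha", "rdf:type", SWRC + "PostDoc"),
    ("person_sha", SWRC + "name", "Siegfried Handschuh"),
    ("person_sha", SWRC + "cooperatesWith", "person_sst"),
    ("person_sst", "rdf:type", SWRC + "Professor"),
    ("person_sst", SWRC + "name", "Steffen Staab"),
}

subclass_of = {
    SWRC + "PostDoc": SWRC + "Employee",
    SWRC + "Professor": SWRC + "Employee",
    SWRC + "Employee": SWRC + "Person",
}

def types_of(resource):
    """All classes of a resource, including superclasses (simple RDFS closure)."""
    found = {o for (s, p, o) in triples if s == resource and p == "rdf:type"}
    frontier = list(found)
    while frontier:
        sup = subclass_of.get(frontier.pop())
        if sup and sup not in found:
            found.add(sup)
            frontier.append(sup)
    return found

print(types_of("person_sha"))  # PostDoc, plus inferred Employee and Person
```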
  5. 5. What's in a link? Informally
     • RDF: pointing to shared data
     • OWL: shared terminology
     • Rules: if-then-else conditions
     • Proof: proof already shown
     • Trust: reliability

     Ontologies and their Relatives (I)
     • There are many relatives around:
       – Controlled vocabularies, thesauri and classification systems available in the WWW
         • Classification systems (e.g. UNSPSC, library science, etc.)
         • Thesauri (e.g. Art & Architecture, Agrovoc, etc.)
         • DMOZ Open Directory
       – Lexical semantic nets
         • WordNet
         • EuroWordNet
       – Topic maps (e.g. used within knowledge management applications)
     • In general it is difficult to find the border line!

     Ontologies and their Relatives (II)
     A spectrum of formality: terms/glossary → informal is-a (thesauri) → formal is-a (catalog/ID) → frames → value restrictions → formal axioms (disjointness, inverse relations, general logical constraints)

     Ontologies – Some Examples
     • General-purpose ontologies: WordNet / EuroWordNet, the Upper Cyc Ontology, IEEE Standard Upper Ontology
     • Domain- and application-specific ontologies: RDF Site Summary (RSS), UMLS, GALEN, SWRC – Semantic Web Research Community, RETSINA Calendering Agent, Dublin Core
     • Web Services ontologies: core ontology of services, Web Service Modeling Ontology, DAML-S
     • Meta-ontologies: semantic translation, RDFT, evolution ontology, …
     • Ontologies in a wider sense: Agrovoc, Art and Architecture, UNSPSC, DTD standardizations (e.g. HR-XML)
  6. 6. Tools for markup…
     • Not tied to specific domains: PhotoStuff (demo)
     • Shared workspace: Xarop (screenshot)
     • M-OntoMat: VDE plug-in for visual descriptor extraction – shape selection, descriptor extraction, prototype instances, domain ontology browser, draw panel. M-OntoMat is publicly available.
  7. 7. Social networks: coming sooner than you may think, e.g. Friend of a Friend (FOAF)
     • Say stuff about yourself (or others) in OWL files, link to who you "know"
     • Estimates of the number of FOAF users range from 2M-5M
     • Using FOAF in other contexts: e.g. from Jennifer Golbeck's FOAF profile, get a B&N price (in euros) …
  8. 8. … of a particular book, in its German edition?

     The Semantic Wave (Berners-Lee, 03): "YOU ARE HERE", 2003 vs. 2005
  9. 9. Now: RDF, RDFS and OWL are ready for prime time
     – Designs are stable, implementations maturing
     • Major research investment translating into application development and commercial spinoffs
       – Adobe 6.0 embraces RDF
       – IBM releases tools, data and partnering
       – HP extending Jena to OWL
       – OWL engines by Ontoprise GmbH, Network Inference, Racer GmbH
       – Proprietary OWL ontologies for vertical markets (cf. pharmacology, HMO/health care, soft drinks, …)
       – Several new starts in the SW space

     The Semantic Web and Machine Learning
     What can machine learning do for the Semantic Web?
     1. Learning ontologies (even if not fully automatic)
     2. Learning to map between ontologies
     3. Deep annotation: reconciling databases and ontologies
     4. Annotation by information extraction
     5. Duplicate recognition
     What can the Semantic Web do for machine learning?
     1. Lots and lots of tools to describe and exchange data for later use by machine learning methods in a canonical way!
     2. Using ontological structures to improve the machine learning task
     3. Provide background knowledge to guide machine learning

     Foundations of the Semantic Web: References
     • Semantic Web Activity at W3C (currently relaunched)
     • Journal of Web Semantics
     • D. Fensel et al.: Spinning the Semantic Web: Bringing the World Wide Web to Its Full Potential, MIT Press, 2003
     • G. Antoniou, F. van Harmelen: A Semantic Web Primer, MIT Press, 2004
     • S. Staab, R. Studer (eds.): Handbook on Ontologies, Springer Verlag, 2004
     • S. Handschuh, S. Staab (eds.): Annotation for the Semantic Web, IOS Press, 2003
     • International Semantic Web Conference series, yearly since 2002, LNCS
     • World Wide Web Conference series, ACM Press; first Semantic Web papers since 1999
     • York Sure, Pascal Hitzler, Andreas Eberhart, Rudi Studer: The Semantic Web in One Day, IEEE Intelligent Systems
     • Some slides have been stolen from various places, from Jim Hendler and Frank van Harmelen in particular.

     Agenda
     • Introduction
     • Foundations of the Semantic Web
     • Ontology Learning
     • Learning Ontology Mapping
     • Semantic Annotation
     • Using Ontologies
     • Applications
  10. 10. The OL Layer Cake
     • Terms: disease, illness, hospital
     • Synonyms: {disease, illness}
     • Concepts: DISEASE := <I, E, L>
     • Concept hierarchies: is_a(DOCTOR, PERSON)
     • Relations: cure(dom:DOCTOR, range:DISEASE)
     • Rules: ∀x, y (married(x, y) → love(x, y))

     How do people acquire taxonomic knowledge?
     • I have no idea!
     • But people apply taxonomic reasoning!
       – "Never do harm to any animal!" ⇒ "Don't do harm to the cat!"
     • More difficult questions: representation, reasoning patterns
     • But let's speculate a bit! ;-)

     What is liver cirrhosis?
     • "Diseases such as liver cirrhosis are difficult to cure." (New York Times)
     • "Mr. Smith died from liver cirrhosis." "Mr. Jagger suffers from liver cirrhosis." "Alcohol abuse can lead to liver cirrhosis."
     ⇒ prob(isa(liver cirrhosis, disease))
  11. 11. How do people acquire taxonomic knowledge? (cont'd)
     What is liver cirrhosis?
     Cirrhosis: noun [uncountable] – serious disease of the liver, often caused by drinking too much alcohol
     liver cirrhosis ≈ cirrhosis ∧ isa(cirrhosis, disease) → prob(isa(liver cirrhosis, disease))

     Evaluation of Ontology Learning
     The a priori approach is based on a gold-standard ontology:
     – Given an ontology modeled by an expert, the so-called gold standard
     – Compare the learned ontology with the gold standard
     • Which methods exist:
       – learning accuracy / precision / recall / F-measure
       – count edges in the "ontology graph":
         • counting of direct relations only (Reinberger 2005)
         • least common superconcept
         • semantic cotopy
         • …
       – evaluation via application (cf. section on using ontologies)

     The Semantic Cotopy [Maedche & Staab 02]
     SC(c, O) = {c' | c' ≤_O c ∨ c ≤_O c'}
     Example, two taxonomies over the tourism domain:
       O1: bookable → { rentable → { driveable → { car, rideable → bike }, appartment }, joinable → { excursion, trip } }
       O2: root → { thing → { vehicle → { car, TWV → bike }, appartment }, activity → { excursion, trip } }
     SC(bike, O1) = {bike, rideable, driveable, rentable, bookable}
     SC(bike, O2) = {bike, TWV, vehicle, thing, root}
     ⇒ TO(bike, O1, O2) = 1/9 !!!
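The bike example above can be reproduced directly. A small sketch (toy taxonomies as child → parent maps, not the tutorial's code) of the semantic cotopy and the resulting taxonomic overlap:

```python
# Semantic cotopy SC(c, O) = {c' | c' <= c or c <= c'} and the taxonomic
# overlap of one concept across two taxonomies (the bike example).
def ancestors(c, parent):
    out = set()
    while c in parent:
        c = parent[c]
        out.add(c)
    return out

def descendants(c, parent):
    return {x for x in parent if c in ancestors(x, parent)}

def semantic_cotopy(c, parent):
    return {c} | ancestors(c, parent) | descendants(c, parent)

# child -> parent edges for the two tourism taxonomies from the slide
O1 = {"rentable": "bookable", "joinable": "bookable", "driveable": "rentable",
      "appartment": "rentable", "rideable": "driveable", "car": "driveable",
      "bike": "rideable", "excursion": "joinable", "trip": "joinable"}
O2 = {"thing": "root", "activity": "root", "vehicle": "thing",
      "appartment": "thing", "TWV": "vehicle", "car": "vehicle",
      "bike": "TWV", "excursion": "activity", "trip": "activity"}

sc1 = semantic_cotopy("bike", O1)   # {bike, rideable, driveable, rentable, bookable}
sc2 = semantic_cotopy("bike", O2)   # {bike, TWV, vehicle, thing, root}
to_bike = len(sc1 & sc2) / len(sc1 | sc2)
print(to_bike)  # 1/9: only "bike" itself is shared
```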
  12. 12. Common Semantic Cotopy
     SC'(c, O1, O2) = {c' | c' ∈ C1 ∩ C2 ∧ (c' ≤_O1 c ∨ c ≤_O1 c')}
     Example (same two taxonomies as above): SC'(driveable) = {bike, car}, SC'(vehicle) = {bike, car}
     ⇒ TO(driveable, O1, O2) = 1

     Semantic Cotopy Revisited (Once More)
     SC''(c, O1, O2) = {c' | c' ∈ C1 ∩ C2 ∧ (c' >_O1 c ∨ c <_O1 c')}
     TO(O1, O2) = (1 / |C1|) · Σ_{c ∈ C1} TO(c, O1, O2)

     One more example, with a flattened O1 (root → thing → {car, bike, apartment}; root → activity → {excursion, trip}):
     SC'(car) = {car}, SC'(vehicle) = {bike, car}
     ⇒ TO(car, O1, O2) = |{car}| / |{bike, car}| = 1/2
  13. 13. Examples for Precision/Recall
     Comparing learned tourism taxonomies against a gold standard (tree figures omitted):
     • P = 100%, R = 100% ⇒ F = 100%
     • P = 100%, R = 87.5% ⇒ F = 93.33%
     • P = 90%, R = 100% ⇒ F = 94.74%
     • P = 100%, R = 40% ⇒ F = 57.14%
  14. 14. Evaluation Methodology: Lexical Recall and F'
     TO(O1, O2) = (1 / |C1|) · Σ_{c ∈ C1} TO(c, O1, O2)

     TO(c, O1, O2) = TO'(c, O1, O2)  if c ∈ C2
                     TO''(c, O1, O2) if c ∉ C2

     TO'(c, O1, O2)  := |SC(c, O1, O2) ∩ SC(c, O2, O1)| / |SC(c, O1, O2) ∪ SC(c, O2, O1)|
     TO''(c, O1, O2) := max_{c' ∈ C2} |SC(c, O1, O2) ∩ SC(c', O2, O1)| / |SC(c, O1, O2) ∪ SC(c', O2, O1)|

     P(O1, O2) = TO(O1, O2)
     R(O1, O2) = TO(O2, O1)
     F(O1, O2) = 2 · P(O1, O2) · R(O1, O2) / (P(O1, O2) + R(O1, O2))

     LR(O1, O2) = |C_O1 ∩ C_O2| / |C_O2|
     F'(O1, O2) = 2 · F(O1, O2) · LR(O1, O2) / (F(O1, O2) + LR(O1, O2))

     Evaluation of Ontology Learning: the a posteriori approach
     – Ask a domain expert for a per-concept evaluation of the learned ontology
     – Count three categories of concepts:
       • Correct: both in the learned and the gold ontology
       • New: only in the learned ontology, but relevant and should be in the gold standard as well
       • Spurious: useless
     – Compute precision = (correct + new) / (correct + new + spurious)
     • As the result: the a priori evaluations are awful – BUT a posteriori evaluations by domain experts still show very good results, very helpful for the domain expert!
     (Sabou M., Wroe C., Goble C., Mishne G.: Learning Domain Ontologies for Web Service Descriptions: an Experiment in Bioinformatics. In Proceedings of the 14th International World Wide Web Conference (WWW2005), Chiba, Japan, 10-14 May 2005.)

     Starting Point in OL from Text
     • Context-based approaches:
       – Distributional hypothesis [Harris 85]: "Words are (semantically) similar to the extent to which they appear in similar (syntactic) contexts"
       – leads to creation of groups
     • Looking for explicit information in: texts, the WWW, thesauri
  15. 15. Looking for Explicit Information
     There are two sources:
     • Looking for patterns in texts:
       – 'is-a' patterns [Hearst 92, 98], [Poesio et al. 02], [Ahmid et al. 03]
       – 'part-of' patterns [Charniak et al. 99]
       – 'causation' patterns [Girju 02/03]
     • Using the Web: [Etzioni et al. 04], [Cimiano et al. 04]

     Pattern-based Approaches (Hearst Patterns)
     • Match patterns in a corpus:
       – NP0 such as NP1 … NPn-1 (and|or) NPn
       – such NP0 as NP1 … NPn-1 (and|or) NPn
       – NP1 … NPn (and|or) other NP0
       – NP0, (including|especially) NP1 … NPn-1 (and|or) NPn
     • For all NPi, 1 ≤ i ≤ n: isa_Hearst(head(NPi), head(NP0))
     • isa_Hearst(t1, t2) = #HearstPatterns(t1, t2) / #HearstPatterns(t1, *)
     • Example: isa_Hearst(conference, event) = 0.44; isa_Hearst(conference, body) = 0.22; isa_Hearst(conference, meeting) = 0.11; isa_Hearst(conference, course) = 0.11; isa_Hearst(conference, activity) = 0.11

     WWW Patterns
     Generate patterns:
     • <t1>s such as <t2>
     • such <t1>s as <t2>
     • <t1>s, especially <t2>
     • <t1>s, including <t2>
     • <t2> and other <t1>s
     • <t2> or other <t1>s
     and query the Web using the Google API:
     isa_WWW(t1, t2) = #Patterns(t1, t2) / #Patterns(t1, *)

     The Vector-Space Model
     • Idea: collect context information based on the distributional hypothesis and represent it as a vector:
                     die_from   suffer_from   enjoy   eat
       disease          X            X
       cirrhosis        X            X
     • Compute similarity among vectors w.r.t. some measure
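The normalisation isa_Hearst(t1, t2) = #HearstPatterns(t1, t2) / #HearstPatterns(t1, *) can be sketched with two of the patterns and a toy corpus (the sentences, the naive depluralisation, and the simplified regexes are all assumptions for illustration):

```python
import re
from collections import Counter

corpus = [
    "He visited countries such as Niger and Chad.",
    "Niger and other countries signed the treaty.",
    "Events such as ISWC attract many researchers.",
]

def singular(noun):
    # naive depluralisation, adequate for this toy corpus only
    if noun.endswith("ies"):
        return noun[:-3] + "y"
    if noun.endswith("s"):
        return noun[:-1]
    return noun

PATTERNS = [
    (re.compile(r"(\w+) such as (\w+)"), "concept-first"),    # NP0s such as NP1
    (re.compile(r"(\w+) and other (\w+)"), "instance-first"),  # NP1 and other NP0s
]

counts = Counter()  # (term, concept) -> number of matched patterns
for sentence in corpus:
    for regex, order in PATTERNS:
        for m in regex.finditer(sentence):
            if order == "concept-first":
                concept, term = m.group(1), m.group(2)
            else:
                term, concept = m.group(1), m.group(2)
            counts[(term.lower(), singular(concept.lower()))] += 1

def isa_hearst(t1, t2):
    total = sum(n for (term, _), n in counts.items() if term == t1)
    return counts[(t1, t2)] / total if total else 0.0

print(isa_hearst("niger", "country"))  # 1.0 on this toy corpus
```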
  16. 16. Clustering Concept Hierarchies from Text
     • Observation: ontology engineers need information about the effectiveness, efficiency and trade-offs of different approaches
     • Similarity-based
       – agglomerative / bottom-up
       – divisive / top-down: Bi-Section-KMeans
     • Set-theoretical
       – set operations (inclusion)
       – FCA, based on Galois lattices
     [Cimiano et al. 03-04]

     Context Extraction
     • Extract syntactic dependencies from text
       ⇒ verb/object, verb/subject, verb/PP relations
       ⇒ car: drive_obj, crash_subj, sit_in, …
     • LoPar, a trainable statistical left-corner parser
     • Pipeline: Parser → tgrep → Lemmatizer → Smoothing → Weighting → Pruning → FCA → Lattice → Compaction

     Example
     "People book hotels. The man drove the bike along the beach."
     book_subj(people), book_obj(hotels), drove_subj(man), drove_obj(bike), drove_along(beach)
     → (lemmatization) → book_subj(people), book_obj(hotel), drive_subj(man), drive_obj(bike), drive_along(beach)

     Weighting (threshold t)
     • Conditional: P(n | v_arg)
     • Hindle: P(n | v_arg) · log( P(n | v_arg) / P(n) )
     • Resnik: S_R(v_arg) · P(n | v_arg) · log( P(n | v_arg) / P(n) ),
       where S_R(v_arg) = Σ_n' P(n' | v_arg) · log( P(n' | v_arg) / P(n') )
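Once such verb-argument contexts are extracted, the vector-space idea reduces to comparing count vectors. A minimal sketch (toy dependency pairs, assumed data, cosine as the similarity measure):

```python
import math
from collections import Counter

# (noun, context) pairs as they would come out of dependency extraction
pairs = [
    ("disease", "die_from"), ("disease", "suffer_from"),
    ("cirrhosis", "die_from"), ("cirrhosis", "suffer_from"),
    ("pizza", "eat"), ("pizza", "enjoy"),
]

vectors = {}
for noun, ctx in pairs:
    vectors.setdefault(noun, Counter())[ctx] += 1

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

print(cosine(vectors["disease"], vectors["cirrhosis"]))  # 1.0: identical contexts
print(cosine(vectors["disease"], vectors["pizza"]))      # 0.0: no shared contexts
```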
  17. 17. Tourism Formal Context
                   bookable   rentable   driveable   rideable   joinable
     appartment       X          X
     car              X          X          X
     motor-bike       X          X          X           X
     excursion        X                                              X
     trip             X                                              X

     Tourism Lattice / Concept Hierarchy
     bookable → { rentable → { driveable → { car, rideable → bike }, appartment }, joinable → { excursion, trip } }

     Agglomerative/Bottom-Up Clustering
     Successively merge the most similar elements, e.g. {car, bus}, then {car, bus, appartment}, … up to one cluster over {car, bus, appartment, excursion, trip}
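The formal concepts of the tourism context above can be enumerated by closure. A compact FCA sketch (naive subset enumeration, fine only for a context this small; not an optimized lattice algorithm):

```python
from itertools import combinations

# object -> attribute sets of the tourism formal context
objects = {
    "appartment": {"bookable", "rentable"},
    "car": {"bookable", "rentable", "driveable"},
    "motor-bike": {"bookable", "rentable", "driveable", "rideable"},
    "excursion": {"bookable", "joinable"},
    "trip": {"bookable", "joinable"},
}

def intent(extent_objs):
    """Attributes shared by all objects in the extent (all attrs if empty)."""
    attrs = [objects[o] for o in extent_objs]
    return set.intersection(*attrs) if attrs else {a for s in objects.values() for a in s}

def extent(attrs):
    """Objects having all the given attributes."""
    return {o for o, a in objects.items() if attrs <= a}

# enumerate all formal concepts as closed (extent, intent) pairs
concepts = set()
names = list(objects)
for r in range(len(names) + 1):
    for combo in combinations(names, r):
        i = intent(set(combo))
        concepts.add((frozenset(extent(i)), frozenset(i)))

for e, i in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(e), "<->", sorted(i))
```

The printed pairs correspond to the nodes of the tourism lattice, e.g. ({car, motor-bike}, {bookable, rentable, driveable}).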
  18. 18. Linkage Strategies
     • Complete-linkage: consider the two most dissimilar elements of each of the clusters ⇒ O(n² log n)
     • Average-linkage: consider the average similarity of the elements in the clusters ⇒ O(n² log n)
     • Single-linkage: consider the two most similar elements of each of the clusters ⇒ O(n²)

     Bi-Section-KMeans
     Divisive / top-down: repeatedly split the data into two clusters, e.g. {car, appartment, bus, trip, excursion} → {appartment, excursion, trip} vs. {car, bus} → …

     Data Sets
     • Tourism (118 mio. tokens): British National Corpus (BNC); handcrafted tourism ontology (289 concepts)
     • Finance (185 mio. tokens): Reuters news from 1987; GETESS finance ontology (1178 concepts)
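Agglomerative clustering with a pluggable linkage strategy can be sketched in a few lines (the similarity values are assumed toy numbers, and the greedy loop makes no claim about the O(n² log n) bookkeeping of real implementations):

```python
def agglomerate(items, sim, linkage=min, until=1):
    """Greedily merge the two most similar clusters until `until` remain.
    linkage=min -> complete linkage, linkage=max -> single linkage."""
    clusters = [frozenset([i]) for i in items]
    while len(clusters) > until:
        pairs = [(a, b) for i, a in enumerate(clusters) for b in clusters[i + 1:]]
        a, b = max(pairs, key=lambda p: linkage(sim[x][y] for x in p[0] for y in p[1]))
        clusters = [c for c in clusters if c != a and c != b] + [a | b]
    return clusters

sim = {  # toy pairwise similarities (assumed values)
    "car":        {"car": 1.0, "bus": 0.9, "appartment": 0.2, "trip": 0.1, "excursion": 0.1},
    "bus":        {"car": 0.9, "bus": 1.0, "appartment": 0.2, "trip": 0.1, "excursion": 0.1},
    "appartment": {"car": 0.2, "bus": 0.2, "appartment": 1.0, "trip": 0.3, "excursion": 0.3},
    "trip":       {"car": 0.1, "bus": 0.1, "appartment": 0.3, "trip": 1.0, "excursion": 0.8},
    "excursion":  {"car": 0.1, "bus": 0.1, "appartment": 0.3, "trip": 0.8, "excursion": 1.0},
}

clusters = agglomerate(list(sim), sim, linkage=min, until=2)  # complete linkage
print(sorted(sorted(c) for c in clusters))  # vehicles vs. activities+appartment
```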
  19. 19. Results in the Tourism and Finance Domains
     (learning-accuracy plots; figures omitted)

     Summary
     Method                   | Effectiveness     | Efficiency   | Traceability
     FCA                      | 43.81% / 41.02%   | O(2^n)       | Good
     Agglomerative Clustering | 36.78% / 33.35%   | O(n² log n)  | Fair
                              | 36.55% / 32.92%   | O(n² log n)  |
                              | 38.57% / 32.15%   | O(n²)        |
     Divisive Clustering      | 36.42% / 32.77%   | O(n²)        | Weak-Fair
  20. 20. Other Clustering Approaches
     • Bottom-up/agglomerative: (ASIUM system) Faure and Nedellec 1998; Caraballo 1999; (Mo'K Workbench) Bisson et al. 2000
     • Other: Hindle 1990; Pereira et al. 1993; Hovy et al. 2000

     Ontology Learning References
     • Reinberger, M.-L., Spyns, P.: Unsupervised text mining for the learning of DOGMA-inspired ontologies. In Buitelaar, P., Cimiano, P., Magnini, B. (eds.), Ontology Learning from Text: Methods, Evaluation and Applications.
     • Cimiano, P., Hotho, A., Staab, S.: Comparing Conceptual, Divisive and Agglomerative Clustering for Learning Taxonomies from Text. ECAI 2004: 435-439.
     • Cimiano, P., Pivk, A., Schmidt-Thieme, L., Staab, S.: Learning Taxonomic Relations from Heterogeneous Evidence. In Buitelaar, P., Cimiano, P., Magnini, B. (eds.), Ontology Learning from Text: Methods, Evaluation and Applications.
     • Sabou, M., Wroe, C., Goble, C., Mishne, G.: Learning Domain Ontologies for Web Service Descriptions: an Experiment in Bioinformatics. In Proceedings of the 14th International World Wide Web Conference (WWW2005), Chiba, Japan, 10-14 May 2005.
     • Maedche, A.: Ontology Learning for the Semantic Web. PhD Thesis, Kluwer, 2001.
     • Maedche, A., Staab, S.: Ontology Learning for the Semantic Web. IEEE Intelligent Systems 16(2): 72-79 (2001).
     • Maedche, A., Staab, S.: Ontology Learning. Handbook on Ontologies 2004: 173-190.
     • Ciaramita, M., Gangemi, A., Ratsch, E., Saric, J., Rojas, I.: Unsupervised Learning of Semantic Relations between Concepts of a Molecular Biology Ontology. IJCAI, 659ff.
     • Schutz, A., Buitelaar, P.: RelExt: A Tool for Relation Extraction from Text in Ontology Extension. ISWC 2005.
     • Faure, D., Nédellec, C.: A corpus-based conceptual clustering method for verb frames and ontology. In Velardi, P. (ed.), Proceedings of the LREC Workshop on Adapting Lexical and Corpus Resources to Sublanguages and Applications, pp. 5-12 (1998).
     • Missikoff, M., Velardi, P., Fabriani, P.: Text Mining Techniques to Automatically Enrich a Domain Ontology. Applied Intelligence 18(3): 323-340 (2003).
     • Bisson, G., Nedellec, C., Cañamero, D.: Designing Clustering Methods for Ontology Building - The Mo'K Workbench. ECAI Workshop on Ontology Learning 2000.

     Agenda
     • Introduction
     • Foundations of the Semantic Web
     • Ontology Learning
     • Learning Ontology Mapping
     • Semantic Annotation
     • Using Ontologies
     • Applications

     Lots of Overlapping Ontologies on the Semantic Web
     • Search Swoogle for "publication": 185 matches in the repository
     • Different definitions, viewpoints, notions
     © Noy
  21. 21. Creating Correspondences Between Ontologies  © Noy

     Ontology-level Mismatches
     • The same terms describing different concepts
     • Different terms describing the same concept
     • Different modeling paradigms, e.g. intervals or points to describe temporal aspects
     • Different modeling conventions
     • Different levels of granularity
     • Different coverage
     • Different points of view
     • …

     Ontology-to-Ontology Mappings: Sources of Information
     • Lexical information: edit distance, …
     • Ontology structure: subclassOf, instanceOf, …
     • User input: "anchor points"
     • External resources: WordNet, …
     • Prior matches
     © Noy
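The "lexical information: edit distance" source can be sketched as a plain Levenshtein distance, normalised into a [0, 1] label similarity (the normalisation by the longer label is one common choice, not prescribed by the slides):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def label_similarity(a, b):
    a, b = a.lower(), b.lower()
    return 1 - levenshtein(a, b) / max(len(a), len(b), 1)

print(label_similarity("Automobile", "Auto"))   # shared prefix, moderate similarity
print(label_similarity("Car", "Boat"))          # low similarity
```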
  22. 22. Mapping Methods
     • Heuristic and rule-based methods
     • Graph analysis
     • Probabilistic approaches
     • Reasoning, theorem proving
     • Machine learning

     Example
     Mapping Automobile (hasSpecification → Speed; instance Marc's Porsche, fast) onto Vehicle (hasOwner → Owner; Car with hasSpeed → Speed; instance Porsche KA-123, Marc, 250 km/h):
     sim_Label = 0.0, sim_Super = 1.0, sim_Instance = 0.9, sim_Relation = 0.9, sim_Aggregation = 0.7

     GLUE: Defining Similarity
     Sim(Assoc. Prof., Snr. Lect.) = P(A ∩ S) / P(A ∪ S)
       = P(A, S) / (P(A, ¬S) + P(A, S) + P(¬A, S))   [Jaccard, 1908]
     Joint probability distribution: P(A, S), P(¬A, S), P(A, ¬S), P(¬A, ¬S)
     Multiple similarity measures can be defined in terms of the JPD.
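The Jaccard similarity over the joint probability distribution reduces to the four partition sizes. A sketch with assumed toy counts:

```python
def jaccard_from_jpd(n_as, n_a_not_s, n_not_a_s, n_not_a_not_s):
    """Sim(A, S) = P(A and S) / P(A or S), estimated from partition sizes.
    n_not_a_not_s does not enter the Jaccard coefficient."""
    union = n_as + n_a_not_s + n_not_a_s
    return n_as / union if union else 0.0

# e.g. 40 instances tagged with both Assoc. Prof. and Snr. Lecturer,
# 10 with only one of the two each, 100 with neither (assumed numbers)
print(jaccard_from_jpd(n_as=40, n_a_not_s=10, n_not_a_s=10, n_not_a_not_s=100))
# 40 / 60, about 0.67
```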
  23. 23. GLUE: No Common Data Instances / Machine Learning for Computing Similarities
     In practice, it is not easy to find data tagged with both ontologies!
     Solution: use machine learning. Train a classifier CL_A on instances of A and a classifier CL_S on instances of S, classify each other's instances, and estimate the JPD by counting the sizes of the resulting partitions (A∧S, A∧¬S, ¬A∧S, ¬A∧¬S).

     GLUE: Improve Predictive Accuracy – Use Multi-Strategy Learning
     • A single classifier cannot exploit all available information
     • Combine the predictions of multiple classifiers (CL_A1 … CL_AN) with a meta-learner:
       – Content learner: frequencies of different words in the text in the data instances
       – Name learner: words used in the names of concepts in the taxonomy
       – Others …

     GLUE Next Step: Exploit Constraints
     • Constraints due to the taxonomy structure (parents/children), e.g. People → Staff → {Fac/Acad/Tech}; {Prof, Assoc. Prof, Asst. Prof} vs. {Prof, Snr. Lect., Lect.}
     • Domain-specific constraints, e.g. Department-Chair can only map to a unique concept
     • Numerous constraints of different types
     • Extended relaxation labeling to ontology matching
  24. 24. Putting It All Together: the GLUE System
     Taxonomy O1 and Taxonomy O2 (structure + data instances) → Distribution Estimator (base learners CL1 … CLN + meta-learner) → joint distributions P(A,B), P(A,¬B), … → Similarity Estimator (similarity function) → similarity matrix → Relaxation Labeler (generic & domain constraints) → mappings for O1, O2

     APFEL: Similarity Features
     Feature / similarity measure per entity type:
     • Concepts: label – string similarity; subclassOf – set similarity; instances – set similarity; …
     • Relations and instances: analogous features
     Aggregation example: sim(e, f) = Σ_k w_k · sim_k(e, f)
     Interpretation: map(e1j) = e2j ← sim(e1j, e2j) > t

     APFEL: Optimize Integration
     Generation of initial alignments → user validation → generation of feature/similarity hypotheses → training: optimized feature/similarity weighting scheme and threshold → iterate (similarity aggregation, interpretation, selection)

     Duplicate Recognition
     • Do two objects refer to the same entity?
       – We know the objects have the same type (their types are mapped/merged)
     • Examples: duplicate removal after merging knowledge bases; citation matching
     © Noy
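The APFEL-style aggregation sim(e, f) = Σ_k w_k · sim_k(e, f) with a threshold t can be sketched directly (the weights, threshold, and per-feature similarity values below are assumed toy numbers, not learned ones):

```python
def aggregate(feature_sims, weights):
    """sim(e, f) = sum_k w_k * sim_k(e, f)"""
    return sum(weights[k] * s for k, s in feature_sims.items())

weights = {"label": 0.5, "subclassOf": 0.3, "instances": 0.2}
threshold = 0.6  # t in the interpretation rule: map iff sim(e, f) > t

candidates = {
    ("Car", "Automobile"): {"label": 0.4, "subclassOf": 1.0, "instances": 0.9},
    ("Car", "Boat"):       {"label": 0.25, "subclassOf": 0.5, "instances": 0.1},
}

for (e, f), sims in candidates.items():
    s = aggregate(sims, weights)
    print(e, "->", f, round(s, 3), "map" if s > threshold else "reject")
```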
  25. 25. Using External Sources for Duplicate Recognition
     Appolo (USC/ISI):
     – Combines an information-integration mediator (Prometheus) with a record-linkage system (Active Atlas)
     – Uses a domain model of sources and the information they provide
     – Extends the standard domain model to incorporate probabilities
     – Learns probability models from large data sets

     Duplicate Recognition: Citation Matching
     Pasula, Marthi (UC Berkeley):
     – Performs citation matching based on probability models for author names, titles, title corruption, etc.
     © Noy

     Learning Ontology Mapping: References
     • User-input driven: Prompt, Chimaera, ONION; Chimaera (Stanford KSL; D. McGuinness et al.); AnchorPrompt (Stanford SMI; Noy, Musen et al.)
     • Similarity Flooding (Melnik, Garcia-Molina, Rahm)
     • IF-Map (Kalfoglou, Schorlemmer)
     • Using metrics to compare OWL concepts (Euzenat and Volchev)
     • QOM (Ehrig and Staab); APFEL (Ehrig, Staab, Sure)
     • Corpus of matches (O. Etzioni, A. Halevy)
     • SAT reasoning: S-Match (U. Trento; Serafini et al.)
     • Mapping composition: semantic gossiping (Aberer et al.), Piazza (Halevy et al.), Prasenjit Mitra

     Agenda
     • Introduction
     • Foundations of the Semantic Web
     • Ontology Learning
     • Learning Ontology Mapping
     • Semantic Annotation
     • Using Ontologies
     • Applications
  26. 26. CREAM – Creating Metadata [K-CAP 2001; WWW 2002]
     Annotation by Markup [K-CAP 2001]
     • Generate class instances, attribute instances and relationship instances from markup (DAML OntoAgents)
     • Download of the markup-only version of OntoMat from http://annotation.

     Annotation by Authoring [WWW 2003]
     • Create text and, if possible, links out of ontology-based metadata (class instances, attribute instances, relationship instances); generates simple text

     Annotation vs. Deep Annotation [WWW 2002]
     • Annotation: input ontology + document; output annotated document
     • Deep annotation: input ontology + database-backed web site; output mapping rules between database and ontology
  27. 27. The Annotation Problem
     • The annotation problem in 4 cartoons © Cimiano
     • The annotation problem from a scientific point of view
     • The annotation problem in practice
     • The vicious cycle
  28. 28. Current State-of-the-Art in Semi-automatic Annotation
     • ML-based IE (e.g. Amilcare @ {OntoMat, MnM}) [EKAW 2002]
       – start with a hand-annotated training corpus
       – rule induction
     • Standard IE (MUC): handcrafted rules, wrappers
     • Large-scale IE [SemTag & Seeker @ WWW'03]: large-scale system, disambiguation with TAP
     • (C-)PANKOW (Cimiano WWW'04, WWW'05)
     • KnowItAll (Etzioni et al. WWW'04)
     (EU IST Dot-Kom)

     Comparison of CREAM and S-CREAM: Different Results
     Core processes: (M) manual annotation (OntoMat-Annotizer) → relational metadata; (A1) information extraction (Amilcare IE tool) → XML-annotated document
     XML annotation:                               Relational metadata:
     <hotel>Zwei Linden</hotel>                    Zwei Linden instOf Hotel; Zwei Linden located_at Dobbertin
     <city>Dobbertin</city>                        Dobbertin instOf City
     <singleroom>Single room</singleroom>          single_room1 instOf Single_Room; Zwei Linden has_Room single_room1; single_room1 has_Rate rate1
     <price>25,66</price><currency>EUR</currency>  rate1 instOf Rate; rate1 price 25,66; rate1 currency EUR
     <doubleroom>Double room</doubleroom>          double_room1 instOf Double_Room; Zwei Linden has_Room double_room1; double_room1 has_Rate rate2
     <price>43,66</price><currency>EUR</currency>  rate2 instOf Rate; rate2 price 43,46; rate2 currency EUR
  29. 29. Comparison of CREAM and S-CREAM (cont'd)
     Core processes: (M) manual annotation (OntoMat) → relational metadata; (A1) information extraction (Amilcare) → XML-annotated document; (A2/A3) mapping the extracted tags via a discourse representation (DR) to the ontology (e.g. Hotel located_at City)
     Currently: a simple centering model; future: learn coherency rules

     IE and Wrapper Learning
     • Boosted wrapper induction
     • Exploiting linguistic constraints
     • Hidden Markov models
     • Data mining and IE
     • Bootstrapping
     • First-order learning

     SemTag
     No tutorial about IE and wrapper learning, but…
     • IE often focuses on a small number of classes, is not easily adaptable to new domains, and needs a lot of training examples
     • It would be great if IE scaled to a large number of classes (concepts) on a large amount of unlabeled data
     • The goal of SemTag is to add semantic tags to the existing HTML body of the Web
     • SemTag uses TAP, a public broad, shallow knowledge base containing lexical and taxonomic information about popular objects like music, movies, sports, etc.
     Example: "The Chicago Bulls announced that Michael Jordan will…" becomes:
       The <resource ref="Team_Bulls">Chicago Bulls</resource> announced yesterday that <resource ref="AthleteJordan_Michael">Michael Jordan</resource> will…
     (Dill et al., SemTag and Seeker, WWW'03)
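Disambiguating a spotted mention like the Chicago Bulls example above against TAP entities can be sketched as a bag-of-words comparison (the entity names, context profiles, and sentences below are assumed toy data, not TAP):

```python
import math
from collections import Counter

# one word-count profile per candidate entity (toy stand-in for TAP context)
profiles = {
    "Team_Bulls": Counter("chicago bulls nba basketball game season".split()),
    "Animal_Bull": Counter("cattle farm animal horns graze".split()),
}

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def disambiguate(context_words):
    """Pick the candidate whose profile is closest to the mention's context."""
    ctx = Counter(w.lower() for w in context_words)
    return max(profiles, key=lambda e: cosine(ctx, profiles[e]))

print(disambiguate("the bulls announced that jordan will play the next game".split()))
```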
  30. 30. SemTag (cont'd)
     • Look up all instances from the ontology (TAP): 65K instances
     • Disambiguate the occurrences as: one of those in the taxonomy / not present in the taxonomy
     • Placing labels in the taxonomy is hard
     • Use a bag-of-words approach for disambiguation
     • 3 people evaluated 200 labels in context; agreed on only 68.5% (metonymy)
     • Applied on 264 million pages; produced 550 million labels and 434 spots; accuracy 82%
     • Annotation by maximal statistical evidence
     (Dill et al., SemTag and Seeker, WWW'03)

     The Self-Annotating Web
     • There is a huge amount of non-formalized knowledge in the Web
     • Use statistics to interpret this non-formalized knowledge and propose formal annotations:
       semantics ≈ syntax + statistics?

     PANKOW: Pattern-based ANnotation through Knowledge On the Web – Patterns
     • HEARST1: <CONCEPT>s such as <INSTANCE>
     • HEARST2: such <CONCEPT>s as <INSTANCE>
     • HEARST3: <CONCEPT>s, (especially/including) <INSTANCE>
     • HEARST4: <INSTANCE> (and/or) other <CONCEPT>s
     • DEFINITE1: the <INSTANCE> <CONCEPT>
     • DEFINITE2: the <CONCEPT> <INSTANCE>
     • APPOSITION: <INSTANCE>, a <CONCEPT>
     • COPULA: <INSTANCE> is a <CONCEPT>
     Examples: countries such as Niger; such countries as Niger; countries, especially Niger; countries, including Niger; Niger and other countries; the Niger country; the country Niger; Niger, a country in Africa; Niger is a country in Africa ⇒ instanceOf(Niger, country)
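PANKOW-style annotation instantiates the patterns above for each (instance, candidate concept) pair and ranks concepts by total web hit count. A sketch where the search engine is replaced by a stub dictionary of assumed toy counts (a real system would query a web search API):

```python
def plural(noun):
    # naive pluralisation, adequate for this toy example only
    return noun[:-1] + "ies" if noun.endswith("y") else noun + "s"

def pankow_queries(instance, concept):
    c, cs = concept, plural(concept)
    return [
        f"{cs} such as {instance}",        # HEARST1
        f"such {cs} as {instance}",        # HEARST2
        f"{cs}, especially {instance}",    # HEARST3
        f"{instance} and other {cs}",      # HEARST4
        f"the {instance} {c}",             # DEFINITE1
        f"the {c} {instance}",             # DEFINITE2
        f"{instance}, a {c}",              # APPOSITION
        f"{instance} is a {c}",            # COPULA
    ]

HITS = {  # stand-in for web hit counts (assumed toy numbers)
    "countries such as Niger": 120, "Niger is a country": 80,
    "the Niger river": 30, "rivers such as Niger": 40,
}

def best_concept(instance, candidates):
    scores = {c: sum(HITS.get(q, 0) for q in pankow_queries(instance, c))
              for c in candidates}
    return max(scores, key=scores.get), scores

concept, scores = best_concept("Niger", ["country", "river"])
print(concept, scores)  # maximal statistical evidence picks "country"
```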
  31. 31. PANKOW Process / Gimme' The Context: C-PANKOW
     • Contextualize the pattern matching by taking into account the similarity of the Google abstract in which the pattern was matched and the page to be annotated
     • Download a fixed number n of Google abstracts matching so-called clues and analyze them linguistically, matching the patterns offline:
       – match more complex structures
       – more efficient, as the number of Google queries only depends on n
       – more offline processing, reducing network traffic

     Comparison
     System          | #    | Recall/Accuracy | Learning Accuracy
     [MUC-7]         | 3    | >> 90%          | n.a.
     [Fleischman02]  | 8    | 70.4%           | n.a.
     PANKOW          | 59   | 24.9%           | 58.91%
     [Hahn98]-TH     | 325  | 21%             | 67%
     [Hahn98]-CB     | 325  | 26%             | 73%
     [Hahn98]-CB     | 325  | 31%             | 76%
     C-PANKOW        | 682  | 29.35%          | 74.37%
     [Alfonseca02]   | 1200 | 17.39%          | 44% (strict)
     (LA based on the least common superconcept (lcs) of two concepts, Hahn 98)

     Web-scale Information Extraction: KnowItAll
     Idea:
     – The Web is the largest knowledge base
     – The goal is to find all instances corresponding to a given concept in the Web and extract them
     The system is:
     – Domain-independent
     – Uses a bootstrapping technique
     – Based on linguistic patterns
     KnowItAll vs. (C-)PANKOW:
     – PANKOW starts from a web page and annotates a given term on the page using the Web
     – KnowItAll starts from a concept and aims at finding all instances on the Web
     (O. Etzioni, 2004)