PATTY:
A Taxonomy of Relational Patterns
with Semantic Types
Authors:
      Ndapandula Nakashole, Gerhard Weikum, Fabian Suchanek
                           (Max Planck Institute for Informatics)
Expositor:
                                                Akihiro Kameda
                       (Aizawa Lab. in The University of Tokyo)
Abstract
●   Syntactic
    + Ontological
    + Lexical
●   Mining algorithm
    ●   Ontological+Lexical
    ●   +Syntactic
    ●   Taxonomy construction
●   Mined result
    ●   5 experimental evaluations
SOL Pattern and synset
●   Example: Syntactic + Ontological + Lexical
    ●   <person>'s [adj] voice * <song>
    ●   “Amy Winehouse's soft voice in 'Rehab'”
●   Type signature: <person> × <song>
●   Support set: {(Amy, Rehab), (Elvis, AllshookUp)}
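●   Synset: a set of synonymous patterns
    ●   syntactically more general: B is more general than A if every string that matches A also matches B
    ●   semantically more general: Q is more general than P if support(P) ⊆ support(Q), written P ⊆sem Q
    ●   synonymous: P ⊆sem Q ∧ Q ⊆sem P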
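The definitions on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the `Pattern` class, the function names, and the second and third patterns (`sang`, `soft`) are assumptions made up for the example; only the first pattern and its support set come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pattern:
    text: str             # SOL pattern string
    signature: tuple      # type signature, e.g. ("person", "song")
    support: frozenset    # set of entity pairs that occur with the pattern


def semantically_subsumes(general: Pattern, specific: Pattern) -> bool:
    """True if support(specific) is a subset of support(general)."""
    return specific.support <= general.support


def synonymous(p: Pattern, q: Pattern) -> bool:
    """True if each pattern semantically subsumes the other."""
    return semantically_subsumes(p, q) and semantically_subsumes(q, p)


# Pattern and support set taken from the slide.
voice = Pattern("<person>'s [adj] voice * <song>", ("person", "song"),
                frozenset({("Amy", "Rehab"), ("Elvis", "AllshookUp")}))
# Hypothetical patterns, invented for this example only.
sang = Pattern("<person> sang <song>", ("person", "song"),
               frozenset({("Amy", "Rehab"), ("Elvis", "AllshookUp")}))
soft = Pattern("<person>'s soft voice in <song>", ("person", "song"),
               frozenset({("Amy", "Rehab")}))
```

Here `voice` semantically subsumes `soft` but not the other way round, while `voice` and `sang` have identical support sets and would therefore fall into the same synset under this strict (hard-inclusion) reading.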
Mining algorithm
●   Pattern extraction and generalization
    ●   Lexical + Ontological → +Syntactic
●   Taxonomy Construction
    ●   Find subsumption relationship
    ●   Integrate them into DAG (directed acyclic graph)
Pattern Extraction
●   Prepare a dictionary of surface names and semantic types
       – YAGO2, Freebase
●   Disambiguation
       – Context-similarity prior proposed by Suchanek 2009
●   Extract the dependency path connecting two named entities (NE)
       – Stanford Parser
●   “Winehouse effortlessly performed her song Rehab”
    → “Amy Winehouse effortlessly performed Rehab (song)”
Syntactic Pattern Generalization
●   Replace words with POS tags, wildcards, or types
    ●   Amy Winehouse's soft voice in 'Rehab'
    ●   <person>'s soft voice in <song>
    ●   <person>'s [adj] voice * <song>
●   First, generate all possible generalizations.
●   A generalization that subsumes multiple patterns with disjoint
    support sets is rejected.
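The rejection rule can be sketched as a small filter. This is a minimal sketch under one assumed reading of the slide: a generalization is rejected as soon as any two of the patterns it subsumes have disjoint support sets. The function name and the example support sets are hypothetical.

```python
from itertools import combinations


def keep_generalization(subsumed_supports):
    """Return False if any two subsumed patterns have disjoint support sets.

    `subsumed_supports` is a list of sets of entity pairs, one set per
    pattern that the candidate generalization would subsume.
    """
    return not any(a.isdisjoint(b)
                   for a, b in combinations(subsumed_supports, 2))


# Hypothetical support sets of patterns subsumed by one candidate.
overlapping = [{("Amy", "Rehab")},
               {("Amy", "Rehab"), ("Elvis", "AllshookUp")}]
disjoint = [{("Amy", "Rehab")},
            {("Elvis", "AllshookUp")}]
```

With these inputs the first candidate would be kept and the second rejected, since its subsumed patterns share no entity pair and so probably express different relations.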
Taxonomy Construction
●   Compare the support sets of every pair of patterns?
    ●   Too slow. → Use the prefix-tree method (Han 2005)
●   Entity pairs are ordered by frequency (descending)
●   Total number of nodes <= |total entity pairs|
●   Depth <= |largest support set|
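A prefix tree over support sets might look roughly like the sketch below. It is a simplification, not the paper's data structure: the `Node` class, the helper names, and the stand-in entity pairs "A" to "D" are assumptions, and each pattern is recorded only at the node where its sorted support set ends.

```python
from collections import Counter


class Node:
    def __init__(self, pair=None):
        self.pair = pair        # entity pair stored at this node
        self.children = {}      # entity pair -> child Node
        self.patterns = []      # patterns whose support set ends here


def build_prefix_tree(supports):
    """Insert each support set as a path, most frequent entity pair first."""
    freq = Counter(pair for s in supports.values() for pair in s)
    root = Node()
    for name, support in supports.items():
        node = root
        # Descending frequency, ties broken by the pair itself.
        for pair in sorted(support, key=lambda p: (-freq[p], p)):
            node = node.children.setdefault(pair, Node(pair))
        node.patterns.append(name)
    return root


def count_nodes(node):
    """Number of nodes below `node` (the empty root is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())


def depth(node):
    """Length of the longest path below `node`."""
    return max((1 + depth(c) for c in node.children.values()), default=0)


# "A".."D" stand in for entity pairs; p1..p3 are hypothetical patterns.
supports = {"p1": {"A", "B"}, "p2": {"A", "B", "C"}, "p3": {"A", "D"}}
tree = build_prefix_tree(supports)
```

Because frequent entity pairs come first, the three paths share the prefix "A" (and p1, p2 share "A", "B"), so the tree has 4 nodes instead of the 7 entity-pair occurrences in the input, and its depth equals the size of the largest support set (3).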
Taxonomy Construction
●   Traverse the tree bottom-up.
●   Find subsumptions by finding set inclusions
●   Soft inclusion: p3 is nearly included in p4
Wilson estimator
(Figure: diagrams of sets S and B)
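●   Naively, deg(S ⊆ B) = |S∩B|/|S|
●   But |S| should also be taken into account: a small S gives only weak evidence
●   Regard S as a random sample from a larger set S'
●   Wilson score interval [c-d, c+d] for the true inclusion ratio
    ●   small |S|: c ≈ 0.5, d ≈ 0.5
    ●   large |S|: c ≈ |S∩B|/|S|, d ≈ 0
●   deg(S ⊆ B) = c-d, with λ = z_{α/2} = 1.96 (95% confidence)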
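The lower bound c-d can be computed with the standard Wilson score interval formula. The sketch below uses that textbook formula; the function names are assumptions, and the paper's exact parameterization may differ in detail.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Return (c, d): center and half-width of the Wilson score interval."""
    p = hits / n
    denom = 1 + z * z / n
    c = (p + z * z / (2 * n)) / denom
    d = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return c, d


def soft_inclusion_degree(s, b, z=1.96):
    """Lower bound c - d on the fraction of S that lies in B."""
    if not s:
        return 0.0
    c, d = wilson_interval(len(s & b), len(s), z)
    return c - d


# S fully contained in B, once with 1 element and once with 1000.
small = soft_inclusion_degree({0}, {0})
large = soft_inclusion_degree(set(range(1000)), set(range(1000)))
```

Both sets have a naive inclusion ratio of 1.0, but the single-element set only reaches about 0.21 while the 1000-element set reaches about 0.996, which is the penalty on small |S| that the slide asks for.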
DAG Construction
●   Remove as few edges as possible to eliminate cycles
●   … which is NP-hard.
●   Greedy algorithm
    ●   add edges in order of Wilson score (highest first)
    ●   if a path for the relation already exists, or the edge would
        create a cycle, do not add it.
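The greedy step can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: edges are given as hypothetical (score, child, parent) triples meaning "child is subsumed by parent", they are processed from highest to lowest score, and reachability is checked with a plain depth-first search.

```python
def reachable(adj, src, dst):
    """Depth-first search: is there a directed path from src to dst?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj.get(node, ()))
    return False


def greedy_dag(scored_edges):
    """Add edges by descending score, skipping redundant or cyclic ones."""
    adj = {}     # child -> set of parents added so far
    kept = []
    for score, child, parent in sorted(scored_edges, reverse=True):
        if reachable(adj, child, parent):   # relation path already exists
            continue
        if reachable(adj, parent, child):   # edge would close a cycle
            continue
        adj.setdefault(child, set()).add(parent)
        kept.append((child, parent))
    return kept


# Hypothetical subsumption candidates with Wilson scores.
edges = [(0.9, "p1", "p2"), (0.8, "p2", "p3"),
         (0.7, "p1", "p3"), (0.6, "p3", "p1")]
dag_edges = greedy_dag(edges)
```

In this example the edge p1 → p3 is dropped because the path p1 → p2 → p3 already exists, and p3 → p1 is dropped because it would create a cycle, leaving only the two highest-scoring edges.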
Mined Result (5 experiments)
●   2 corpora
    ●   The New York Times archive (NYT): about 1.8 million newspaper
        articles from 1987 to 2007
    ●   The English edition of Wikipedia (WKP): about 3.8 million
        articles (as of June 21, 2011)
●   2 knowledge bases
    ●   YAGO2: about 350,000 semantic classes from WordNet and the
        Wikipedia category system
    ●   Freebase: 85 domains with a total of about 2,000 types
●   Ordered or random sampling
    ●   typed/untyped order
Summary of experiment
●   High precision
●   High recall
●   WKP > NYT
●   YAGO2 > Freebase
●   Type information is a strong signal
●   Interesting
Summary
●   Syntactic + Ontological + Lexical Patterns
    with taxonomy tree
●   350,569 synsets      / precision 84.7%
      8,162 subsumptions / precision 75.0%
●   Available online!
    http://www.mpi-inf.mpg.de/yago-naga/patty/
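Q&A
●   Are synsets built using hard inclusion?
    ●   The paper is vague on this, but presumably patterns that
        strongly include each other under soft inclusion are merged
        into a synset.
●   Could just the taxonomy construction part be attached to the
    paper that Takase-san presented?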


Editor's Notes

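  1. What the paper does: it defines a rule framework that combines S, O, and L, and extracts such rules from corpora. As for what "combining" means, seeing is believing, so please look at this example. Beyond that, the interesting parts of the paper are the method used to extract the patterns and the results obtained when it is run at large scale. "Lexical" and "Textual" are used with similar meanings.
  2. It is easier to follow if the DAG is described as a tree structure.
  3. The parser is used to follow the shortest path and to extract only the paths that start with something subject-like.
  4. Explanation of A, B, C: the ordering trick, the total number of nodes, and the height.
  5. The range is the Wilson score interval, but since only c-d is actually used, the slide title says "estimator". λ grows without bound as the confidence level increases.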