Automatic Hypernym Classification: Towards the Induction of ...
Presentation Transcript

  • CS 124/LINGUIST 180: From Languages to Information
    Dan Jurafsky
    Lecture 14: Information Extraction and Semantic Relation learning
    Lots of slides from many people, including Rion Snow, Jim Martin, Chris Manning, and William Cohen.
  • Background: Information Extraction
    Extract information from text
    Sometimes called text analytics commercially
    Extract entities (the people, organizations, locations, times, dates, genes, diseases, medicines, etc. in a text)
    Extract the relations between entities
    Figure out the larger events that are taking place
  • Information Extraction
    Creating knowledge bases and ontologies
    Implications for cognitive modeling
    Digital Libraries
    Google scholar, Citeseer need to extract the title, author and references
    Bioinformatics
    Patent analysis
    Specific market segments for stock analysis
    SEC filings
    Intelligence analysis
  • Outline
    Reminder: Named Entity Tagging
    Relation Extraction
    Hand-built patterns
    Seed (bootstrap) methods
    Supervised classification
    Distant supervision
  • What is “Information Extraction”
    As a task:
    Filling slots in a database from sub-segments of text.
    October 14, 2002, 4:00 a.m. PT
    For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
    Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
    "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.“
    Richard Stallman, founder of the Free Software Foundation, countered saying…
    NAME TITLE ORGANIZATION
    Slide from William Cohen
  • What is “Information Extraction”
    As a task:
    Filling slots in a database from sub-segments of text.
    October 14, 2002, 4:00 a.m. PT
    For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
    Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
    "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access."
    Richard Stallman, founder of the Free Software Foundation, countered saying…
    IE
    NAME              TITLE    ORGANIZATION
    Bill Gates        CEO      Microsoft
    Bill Veghte       VP       Microsoft
    Richard Stallman  founder  Free Soft..
    Slide from William Cohen
  • What is “Information Extraction”
    As a family of techniques:
    Information Extraction =
    segmentation + classification + clustering + association
    October 14, 2002, 4:00 a.m. PT
    For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
    Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
    "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access."
    Richard Stallman, founder of the Free Software Foundation, countered saying…
    Microsoft Corporation
    CEO
    Bill Gates
    Microsoft
    Gates
    Microsoft
    Bill Veghte
    Microsoft
    VP
    Richard Stallman
    founder
    Free Software Foundation
    “named entity extraction”
    Slide from William Cohen
  • What is “Information Extraction”
    As a family of techniques:
    Information Extraction =
    segmentation + classification + association + clustering
    October 14, 2002, 4:00 a.m. PT
    For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
    Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
    "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access."
    Richard Stallman, founder of the Free Software Foundation, countered saying…
    Microsoft Corporation
    CEO
    Bill Gates
    Microsoft
    Gates
    Microsoft
    Bill Veghte
    Microsoft
    VP
    Richard Stallman
    founder
    Free Software Foundation
    Slide from William Cohen
  • What is “Information Extraction”
    As a family of techniques:
    Information Extraction =
    segmentation + classification+ association + clustering
    October 14, 2002, 4:00 a.m. PT
    For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
    Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
    "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access."
    Richard Stallman, founder of the Free Software Foundation, countered saying…
    Microsoft Corporation
    CEO
    Bill Gates
    Microsoft
    Gates
    Microsoft
    Bill Veghte
    Microsoft
    VP
    Richard Stallman
    founder
    Free Software Foundation
    Slide from William Cohen
  • What is “Information Extraction”
    NAME              TITLE    ORGANIZATION
    Bill Gates        CEO      Microsoft
    Bill Veghte       VP       Microsoft
    Richard Stallman  founder  Free Soft..
    As a family of techniques:
    Information Extraction =
    segmentation + classification+ association+ clustering
    October 14, 2002, 4:00 a.m. PT
    For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
    Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
    "We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access."
    Richard Stallman, founder of the Free Software Foundation, countered saying…
    Microsoft Corporation
    CEO
    Bill Gates
    Microsoft
    Gates
    Microsoft
    Bill Veghte
    Microsoft
    VP
    Richard Stallman
    founder
    Free Software Foundation
    Slide from William Cohen
  • Extracting Structured Knowledge
    Each article can contain hundreds or thousands of items of knowledge...
    “The Lawrence Livermore National Laboratory (LLNL) in Livermore, California is a scientific research laboratory founded by the University of California in 1952.”
    LLNL EQ Lawrence Livermore National Laboratory
    LLNL LOC-IN California
    Livermore LOC-IN California
    LLNL IS-A scientific research laboratory
    LLNL FOUNDED-BY University of California
    LLNL FOUNDED-IN 1952
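Concretely, the extracted facts are just (subject, relation, object) triples. A minimal sketch, not from the lecture and with illustrative names only, of representing and querying such a machine-readable summary in Python:

```python
# Representing the extracted facts above as (subject, relation, object) triples
# and running a trivial query over them. Illustrative only.
from collections import namedtuple

Triple = namedtuple("Triple", ["subj", "rel", "obj"])

kb = [
    Triple("LLNL", "EQ", "Lawrence Livermore National Laboratory"),
    Triple("LLNL", "LOC-IN", "California"),
    Triple("Livermore", "LOC-IN", "California"),
    Triple("LLNL", "IS-A", "scientific research laboratory"),
    Triple("LLNL", "FOUNDED-BY", "University of California"),
    Triple("LLNL", "FOUNDED-IN", "1952"),
]

def query(kb, subj=None, rel=None):
    """Return the objects of all triples matching the given subject/relation."""
    return [t.obj for t in kb if (subj is None or t.subj == subj)
                              and (rel is None or t.rel == rel)]

print(query(kb, subj="LLNL", rel="FOUNDED-BY"))   # ['University of California']
```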
  • Goal: Machine-readable summaries
    Structured knowledge extraction: Summary for machine
    Textual abstract:
    Summary for human
  • From Unstructured Text to Structured Knowledge
    Unstructured Text
    News articles...
    slide from Rion Snow
  • From Unstructured Text to Structured Knowledge
    Unstructured Text
    Blog posts....
    slide from Rion Snow
  • From Unstructured Text to Structured Knowledge
    Unstructured Text
    Scientific journal articles...
    slide from Rion Snow
  • From Unstructured Text to Structured Knowledge
    Unstructured Text
    Tweets, instant messages, chat logs...
    slide from Rion Snow
  • From Unstructured Text to Structured Knowledge
    Unstructured Text
    slide from Rion Snow
  • From Unstructured Text to Structured Knowledge
    Structured Knowledge
    Unstructured Text
    slide from Rion Snow
  • Reminder: Task 1: Named Entity Tagging
    • General NER or Biomedical NER
    <PER> John Hennessy</PER> is a professor at <ORG> Stanford University </ORG>, in <LOC> Palo Alto </LOC>.
    <RNA> TAR </RNA> independent transactivation by <PROTEIN> Tat </PROTEIN> in cells derived from the <CELL> CNS </CELL> - a novel mechanism of <DNA> HIV-1 gene </DNA> regulation.
    Slide from Chris Manning
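For concreteness, here is a minimal sketch of off-the-shelf general NER using spaCy (just one pretrained tagger, not necessarily the one used in the course; it assumes the en_core_web_sm model has been downloaded):

```python
# Sketch: tagging named entities with spaCy's pretrained pipeline.
# Assumes: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("John Hennessy is a professor at Stanford University, in Palo Alto.")

for ent in doc.ents:
    # Expected output along the lines of:
    #   John Hennessy        PERSON
    #   Stanford University  ORG
    #   Palo Alto            GPE
    print(f"{ent.text:20s} {ent.label_}")
```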
  • Reminder: Maximum Entropy Markov Model
    [Figure: MEMM tagging the tokens "regulation", "HIV-1", "gene", "of" with DNA / O labels]
    Slide from Chris Manning
  • Task II: Relation Extraction
  • Relations between words
    Language understanding applications need word meaning!
    Question answering
    Conversational agents
    Summarization
    One key meaning component: word relations
    Hierarchical (ontological) relations
    “San Francisco” ISA “city”
    Other relations between words
    “alternator” is a part of a “car”
  • Relation Prediction
    “...works by such authors as Herrick, Goldsmith, and Shakespeare.”
    “If you consider authors like Shakespeare...”
    “Some authors (including Shakespeare)...”
    “Shakespeare was the author of several...”
    “Shakespeare, author of The Tempest...”
    Shakespeare IS-A author (0.87)
    How can we capture the variability of expression of a relation in natural text from a large, unannotated corpus?
  • Hyponymy
    One sense is a hyponym of another if the first sense is more specific, denoting a subclass of the other
    car is a hyponym of vehicle
    dog is a hyponym of animal
    mango is a hyponym of fruit
    Conversely
    vehicle is a hypernym/superordinate of car
    animal is a hypernym of dog
    fruit is a hypernym of mango
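These hypernym/hyponym links are exactly what WordNet stores. A minimal sketch of looking them up with NLTK's WordNet interface (assuming nltk and its wordnet data are installed):

```python
# Sketch: querying hypernyms/hyponyms in WordNet via NLTK.
# Assumes: pip install nltk  and  nltk.download('wordnet')
from nltk.corpus import wordnet as wn

car = wn.synset("car.n.01")
print(car.hypernyms())       # [Synset('motor_vehicle.n.01')] -- car IS-A motor vehicle
print(car.hyponyms()[:3])    # a few more-specific kinds of car

dog = wn.synset("dog.n.01")
cat = wn.synset("cat.n.01")
print(dog.lowest_common_hypernyms(cat))   # shared ancestor, e.g. [Synset('carnivore.n.01')]
```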
  • WordNet relations
    X is-a-kind-of Y
    (hyponym / hypernym)
    X is-a-part-of Y
    (meronym / holonym)
    slide from Rion Snow
  • WordNet is incomplete; ontological relations are missing for many words
    This is especially true for specific domains (restaurants, auto parts, finance)
  • Other kinds of Relations: Disease Outbreaks
    Extract structured information from text
    Slide from Eugene Agichtein
    May 19 1995, Atlanta -- The Centers for Disease Control and Prevention, which is in the front line of the world's response to the deadly Ebola epidemic in Zaire , is finding itself hard pressed to cope with the crisis…
    Disease Outbreaks in The New York Times
    Information Extraction System (e.g., NYU’s Proteus)
  • More relations: Protein Interactions
    [Figure: extracted relations -- CBF-A and CBF-C interact to form a complex; CBF-B associates with the CBF-A-CBF-C complex]
    "We show that CBF-A and CBF-C interact with each other to form a CBF-A-CBF-C complex and that CBF-B does not interact with CBF-A or CBF-C individually but that it associates with the CBF-A-CBF-C complex."
    Slide from Rosario and Hearst
  • Yet More Relations
    CHICAGO (AP) — Citing high fuel prices, United Airlines said Friday it has increased fares by $6 per round trip on flights to some cities also served by lower-cost carriers. American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said. United, a unit of UAL, said the increase took effect Thursday night and applies to most routes where it competes against discount carriers, such as Chicago to Dallas and Atlanta and Denver to San Francisco, Los Angeles and New York.
    Slide from Jim Martin
  • Relation Types
    For generic news texts...
    Slide from Jim Martin
  • More relations: UMLS
    Unified Medical Language System
    integrates linguistic, terminological and semantic information
    Semantic Network consists of 134 semantic types and 54 relations between types
    Pharmacologic Substance affects Pathologic Function
    Pharmacologic Substance causes Pathologic Function
    Pharmacologic Substance complicates Pathologic Function
    Pharmacologic Substance diagnoses Pathologic Function
    Pharmacologic Substance prevents Pathologic Function
    Pharmacologic Substance treats Pathologic Function
    Slide from Paul Buitelaar
  • Relations in Ontologies: GO (Gene Ontology)
    GO (Gene Ontology)
    Aligns descriptions of gene products in different databases, including plant, animal and microbial genomes
    Organizing principles are molecular function, biological process and cellular component
    Accession: GO:0009292
    Ontology: biological process
    Synonyms: broad: genetic exchange
    Definition: In the absence of a sexual life cycle, the processes involved in the introduction of genetic information to create a genetically different individual.
    Term Lineage all : all (164142)
    GO:0008150 : biological process (115947)
    GO:0007275 : development (11892)
    GO:0009292 : genetic transfer (69)
    Slide from Paul Buitelaar
  • Relations in Ontologies: geographical
    [Figure (design: Philipp Cimiano): an F-Logic geographical ontology. Classes: Geographical Entity (GE), with subclasses Inhabited GE (city, country) and Natural GE (river, mountain). Relations: is-a, instance_of, similar, capital_of, located_in, flow_through; attributes: length (km), height (m). Example instances: Neckar, Zugspitze, Germany, Berlin, Stuttgart; example values: 367 (km), 2962 (m).]
    Slide from Paul Buitelaar
  • MeSH (Medical Subject Headings) Thesaurus
    [Figure: a MeSH descriptor with its definition and synonym set]
    Slide from Illhoi Yoo, Xiaohua (Tony) Hu,and Il-Yeol Song
  • MeSH Tree
    MeSH Ontology
    Hierarchically arranged from most general to most specific
    Actually a graph rather than a tree
    Terms normally appear in more than one place in the tree
    MeSH Tree
    Slide from Illhoi Yoo, Xiaohua (Tony) Hu,and Il-Yeol Song
  • Slide from Doug Appelt
    Types of ACE Relations, 2003
    ROLE - relates a person to an organization or a geopolitical entity
    Subtypes: member, owner, affiliate, client, citizen
    PART - generalized containment
    Subtypes: subsidiary, physical part-of, set membership
    AT - permanent and transient locations
    Subtypes: located, based-in, residence
    SOCIAL - social relations among persons
    Subtypes: parent, sibling, spouse, grandparent, associate
  • Frequent Freebase Relations
  • Predicting the “is-a” relation
    “...works by such authors as Herrick, Goldsmith, and Shakespeare.”
    “If you consider authors like Shakespeare...”
    “Some authors (including Shakespeare)...”
    “Shakespeare was the author of several...”
    “Shakespeare, author of The Tempest...”
    Shakespeare IS-A author (0.87)
    How can we capture the variability of expression of a relation in natural text from a large, unannotated corpus?
  • Why this is hard: Ambiguity! Which relations hold between two entities (a Treatment and a Disease)?
    Cure?
    Prevent?
    Side Effect?
  • Different relations between Disease (Hepatitis) and Treatment
    Cure
    These results suggest that con A-induced hepatitis was ameliorated by pretreatment with TJ-135.
    Prevent
    A two-dose combined hepatitis A and B vaccine would facilitate immunization programs
    Vague
    Effect of interferon on hepatitis B
    Slide from Barbara Rosario and Marti Hearst
  • 5 easy methods for relation extraction
    Hand-built patterns
    Supervised methods
    Bootstrapping (seed) methods
    Unsupervised methods
    Distant supervision
  • 5 easy methods for relation extraction
    Hand-built patterns
    Bootstrapping (seed) methods
    Supervised methods
    Unsupervised methods
    Distant supervision
  • A complex hand-built extraction rule [NYU Proteus]
  • Goal: Add hyponyms to WordNet directly from text.
    Intuition from Hearst (1992)
    “Agar is a substance prepared from a mixture of red algae, such as Gelidium, for laboratory or industrial use”
    What does Gelidium mean?
    How do you know?
  • Hearst’s Hand-Designed Lexico-Syntactic Patterns
    (Hearst, 1992): Automatic Acquisition of Hyponyms
    “Y such as X ((, X)* (, and/or) X)”
    “such Y as X…”
    “X… or other Y”
    “X… and other Y”
    “Y including X…”
    “Y, especially X…”
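A minimal sketch of what matching one of these patterns can look like in practice. The noun-phrase "grammar" here is a deliberately crude regular expression; a real implementation would use a chunker or parser, as Hearst did:

```python
# Toy matcher for the Hearst pattern "Y such as X": Y is the hypernym, X the hyponym.
# The NP patterns below are crude placeholders, not a serious grammar.
import re

SUCH_AS = re.compile(r"(?P<hyper>\w+(?: \w+)?),? such as (?P<hypo>[A-Z]\w+)")

def hearst_such_as(sentence):
    """Return (hyponym, hypernym) pairs proposed by the 'Y such as X' pattern."""
    return [(m.group("hypo"), m.group("hyper")) for m in SUCH_AS.finditer(sentence)]

sent = ("Agar is a substance prepared from a mixture of red algae, "
        "such as Gelidium, for laboratory or industrial use")
print(hearst_such_as(sent))   # [('Gelidium', 'red algae')]
```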
  • Hearst’s hand-built patterns for Relation Extraction
  • Problem with hand-built patterns
    Requires that we hand-build patterns for each relation!
    don’t want to have to do this for all possible relations!
    we’d like better accuracy
  • 5 easy methods for relation extraction
    Hand-built patterns
    Supervised methods
    Bootstrapping (seed) methods
    Unsupervised methods
    Distant supervision
  • 2. Supervised Relation Extraction
    Sometimes done in 3 steps:
    Find all pairs of named entities
    Decide if the 2 entities are related
    If yes, classify the relation
    Why the extra step?
    Cuts down on training time for classification by eliminating most pairs
    Allows separate feature sets that are appropriate for each task
  • Relation Analysis
    Usually just run on named entities within the same sentence
    Slide from Jim Martin
  • Slide from Jing Jiang
    Relation Extraction
    Task definition: to label the semantic relation between a pair of entities in a sentence (fragment)
    …[leader]arg-1 of a minority [government]arg-2…
    PHYS
    PER-SOC
    EMP-ORG
    NIL
    PHYS: Physical
    PER-SOC: Personal / Social
    EMP-ORG: Employment / Membership / Subsidiary
  • Supervised Learning
    Supervised machine learning (e.g. [Zhou et al. 2005], [Bunescu & Mooney 2005], [Zhang et al. 2006], [Surdeanu & Ciaramita 2007])
    Training data is needed for each relation type
    …[leader]arg-1 of a minority [government]arg-2…
    arg-1 word: leader
    arg-2 type: ORG
    dependency: arg-1 - of - arg-2
    EMP-ORG
    PHYS
    PER-SOC
    NIL
    Slide from Jing Jiang
  • ACE 2008 Six Relations
  • Features: Words
    Headwords of M1 and M2, and combination
    George Washington Bridge
    Bag of words and bigrams in M1 and M2
    Words or bigrams in particular positions to the left and right of M1 and M2
    +/- 1, 2, 3
    Bag of words or bigrams between the two entities
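A minimal sketch (function and feature names are illustrative, not from any particular system) of computing these word-level features for a sentence with two marked mentions:

```python
# Word features for a candidate mention pair (M1, M2); spans are half-open
# (start, end) token indices. Feature names are illustrative.
def word_features(tokens, m1, m2):
    (s1, e1), (s2, e2) = m1, m2
    feats = {
        "head_m1": tokens[e1 - 1],                    # crude headword: last token of M1
        "head_m2": tokens[e2 - 1],
    }
    feats["head_pair"] = feats["head_m1"] + "_" + feats["head_m2"]
    for k in (1, 2, 3):                               # words at +/- 1, 2, 3 around the mentions
        if s1 - k >= 0:
            feats[f"m1_left_{k}"] = tokens[s1 - k]
        if e2 + k - 1 < len(tokens):
            feats[f"m2_right_{k}"] = tokens[e2 + k - 1]
    for w in tokens[e1:s2]:                           # bag of words between the two mentions
        feats[f"between_{w}"] = 1
    return feats

toks = "American Airlines , a unit of AMR , immediately matched the move".split()
print(word_features(toks, (0, 2), (6, 7)))            # M1 = American Airlines, M2 = AMR
```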
  • Features: Named Entity Type and Mention Level
    Named-entity types (ORG, LOC, etc)
    Concatenation of the types
    Entity Level of M1 and M2
    (NAME, NOMINAL, PRONOUN)
  • Features: Parse Tree and Base Phrases
    Syntactic environment
    Constituent path through the tree from one to the other
    Base syntactic chunk sequence from one to the other
    Dependency path
    Slide from Jim Martin
  • Features: Gazetteers and trigger words
    Personal relative trigger list
    from WordNet: parent, wife, husband, grandparent, etc.
    Country name list
  • American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said.
  • Classifiers for supervised methods
    Now you can use any classifier you like
    SVM
    Logistic regression
    Naïve Bayes
    etc
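Putting the pieces together: a minimal sketch, with toy training data and labels loosely following the ACE types above, of feeding feature dictionaries like the ones sketched earlier into a scikit-learn classifier:

```python
# Toy relation classifier: feature dicts -> DictVectorizer -> logistic regression.
# Training examples and labels are invented for illustration (loosely ACE-style).
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_feats = [
    {"head_pair": "Airlines_AMR", "between_unit": 1, "between_of": 1, "ne_types": "ORG_ORG"},
    {"head_pair": "Wagner_Airlines", "between_spokesman": 1, "ne_types": "PER_ORG"},
    {"head_pair": "Gates_Microsoft", "between_CEO": 1, "ne_types": "PER_ORG"},
]
train_labels = ["PART", "EMP-ORG", "EMP-ORG"]

clf = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(train_feats, train_labels)

test = {"head_pair": "Veghte_Microsoft", "between_VP": 1, "ne_types": "PER_ORG"}
print(clf.predict([test]))    # most likely 'EMP-ORG' on this toy data
```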
  • Summary
    Can get high accuracies with enough hand-labeled training data
    If test data looks exactly like the training data
    But
    labeling 5000 relations (and named entities) is expensive
    the approach doesn’t generalize to different genres
  • 5 easy methods for relation extraction
    Hand-built patterns
    Supervised methods
    Bootstrapping (seed) methods
    Unsupervised methods
    Distant supervision
  • Slide from Jim Martin
    Bootstrapping Approaches
    What if you don’t have enough annotated text to train on?
    But you might have some seed tuples
    Or you might have some patterns that work pretty well
    Can you use those seeds to do something useful?
    Co-training and active learning use the seeds to train classifiers to tag more data to train better classifiers...
    Bootstrapping tries to learn directly (populate a relation) through direct use of the seeds
  • Slide from Jim Martin
    Bootstrapping Example: Seed Tuple
    <Mark Twain, Elmira> Seed tuple
    Grep (google)
    “Mark Twain is buried in Elmira, NY.”
    X is buried in Y
    “The grave of Mark Twain is in Elmira”
    The grave of X is in Y
    “Elmira is Mark Twain’s final resting place”
    Y is X’s final resting place.
    Use those patterns to grep for new tuples that you don’t already know
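A minimal sketch of this loop, with the "grep/Google" step simulated over a tiny in-memory corpus; a real system would query a search engine or a large indexed collection, handle both argument orders, and score patterns and tuples for confidence:

```python
# Toy bootstrapping: seed tuple -> patterns -> new tuples.
import re

corpus = [
    "Mark Twain is buried in Elmira, NY.",
    "The grave of Mark Twain is in Elmira.",
    "Emily Dickinson is buried in Amherst, MA.",
]

def induce_pattern(sentence, x, y):
    """Use the text between the two seed strings as a (very crude) pattern."""
    if x in sentence and y in sentence and sentence.index(x) < sentence.index(y):
        return sentence[sentence.index(x) + len(x):sentence.index(y)]
    return None

def apply_pattern(middle, sentences):
    """Wrap the pattern with capitalized-name placeholders and extract new tuples."""
    name = r"[A-Z][a-zA-Z]+(?: [A-Z][a-zA-Z]+)*"
    regex = "(?P<x>" + name + ")" + re.escape(middle) + r"(?P<y>[A-Z][a-zA-Z]+)"
    return [(m.group("x"), m.group("y")) for s in sentences for m in re.finditer(regex, s)]

seeds = [("Mark Twain", "Elmira")]
patterns = set()
for x, y in seeds:
    for s in corpus:
        p = induce_pattern(s, x, y)
        if p:
            patterns.add(p)

new_tuples = {t for p in patterns for t in apply_pattern(p, corpus)}
print(patterns)      # {' is buried in ', ' is in '}
print(new_tuples)    # now also includes ('Emily Dickinson', 'Amherst')
```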
  • Hearst (1992) proposal for bootstrapping
    Choose lexical relation R.
    Gather a set of pairs that have this relation
    Find places in the corpus where these expressions occur near each other and record the environment.
    Find the commonalities among these environments and hypothesize that common ones yield patterns that indicate the relation of interest.
  • Slide from Jim Martin
    Bootstrapping Relations
  • DIPRE (Brin 1998)
    Extract <author, book> pairs.
    Start with these 5 seeds
    Learn these patterns:
    Now iterate, using these patterns to get more instances and patterns…
  • Snowball (Agichtein and Gravano 2000)
    New idea: require that X and Y be named entities of particular types
    ORGANIZATION {<’s 0.7> <in 0.7> <headquarters 0.7>} LOCATION
    LOCATION {<- 0.75> <based 0.75>} ORGANIZATION
  • 5 easy methods for relation extraction
    Hand-built patterns
    Supervised methods
    Bootstrapping (seed) methods
    Unsupervised methods
    Distant supervision
  • Distant supervision paradigm
    Snow, Jurafsky, Ng. 2005. Learning syntactic patterns for automatic hypernym discovery. NIPS 17
    Mintz, Bills, Snow, Jurafsky (2009) Distant supervision for relation extraction without labeled data. ACL-2009.
    Instead of hand-creating 5 seed examples
    Use a large database to get our seed examples
    lots of examples
    supervision from a database, not a corpus!
    Not genre-dependent!
    Create lots and lots of noisy features from all these examples
    Combine in a classifier
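A minimal sketch of the labeling step under this paradigm, using WordNet (via NLTK) as the supervising database: any noun pair that co-occurs in a corpus sentence is labeled positive if WordNet already knows the hypernym relation, negative otherwise. The candidate pairs below are toy examples, not drawn from the actual corpus:

```python
# Distant supervision for IS-A: label co-occurring noun pairs using WordNet.
# Assumes NLTK and its wordnet data are installed.
from nltk.corpus import wordnet as wn

def wordnet_says_isa(hypo, hyper):
    """True if some noun sense of `hyper` is an ancestor of some noun sense of `hypo`."""
    hyper_synsets = set(wn.synsets(hyper, pos=wn.NOUN))
    up = lambda s: s.hypernyms() + s.instance_hypernyms()
    for s in wn.synsets(hypo, pos=wn.NOUN):
        if set(s.closure(up)) & hyper_synsets:
            return True
    return False

# Noun pairs observed co-occurring in corpus sentences (toy candidates):
candidates = [("Shakespeare", "author"), ("Shakespeare", "Love"), ("dog", "animal")]
labeled = [(pair, wordnet_says_isa(*pair)) for pair in candidates]
print(labeled)
# Expected roughly: [(('Shakespeare', 'author'), True),
#                    (('Shakespeare', 'Love'), False),
#                    (('dog', 'animal'), True)]
```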
  • Distant supervision paradigm
    Has advantages of supervised classification:
    use of rich, hand-created knowledge
    Has advantages of unsupervised classification:
    infinite amounts of data
    allows for very large number of weak features
    not sensitive to training corpus
  • Relation Classification with “Distant Supervision”
    We construct a noisy training set consisting of occurrences from our corpus that contain an IS-A pair according to WordNet.
    This leads to high-signal examples like:
    “...consider authors like Shakespeare...”
    “Some authors (including Shakespeare)...”
    “Shakespeare was the author of several...”
    “Shakespeare, author of The Tempest...”
    But noisy examples like:
    “The author of Shakespeare in Love...”
    “...authors at the Shakespeare Festival...”
    Training set (TREC and Wikipedia):
    14,000 hypernym pairs, ~600,000 total pairs
    Slide from Rion Snow
  • How to learn patterns
    Snow, Jurafsky, Ng. 2005. Learning syntactic patterns for automatic hypernym discovery. NIPS 17
    Example sentence: …doubly heavy hydrogen atom called deuterium…
    1. Take corpus sentences
    2. Collect noun pairs, e.g. (atom, deuterium): 752,311 pairs from 6M words of newswire
    3. Is the pair an IS-A in WordNet? (here: YES): 14,387 yes, 737,924 no
    4. Parse the sentences
    5. Extract patterns: 69,592 dependency paths (occurring in >5 pairs)
    6. Train a classifier on these patterns: logistic regression with 70K features (actually converted to 974,288 bucketed binary features)
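A minimal sketch of steps 4 and 5: extracting a lexicalized dependency path between the two nouns. The paper used the MINIPAR parser; this sketch substitutes spaCy (assuming en_core_web_sm is installed), so the exact path labels will differ, but the idea is the same: the path becomes one feature for the classifier.

```python
# Extract a crude lexicalized dependency path between two tokens with spaCy.
# (The paper used MINIPAR paths; this is just the same idea with a different parser.)
import spacy

nlp = spacy.load("en_core_web_sm")

def dependency_path(doc, i, j):
    """Path from token i up to the lowest common ancestor and down to token j."""
    anc_i = [doc[i]] + list(doc[i].ancestors)
    anc_j = [doc[j]] + list(doc[j].ancestors)
    ids_j = [t.i for t in anc_j]
    common = next(t for t in anc_i if t.i in ids_j)              # lowest common ancestor
    up = anc_i[:[t.i for t in anc_i].index(common.i)]            # tokens on i's side
    down = list(reversed(anc_j[:ids_j.index(common.i)]))         # tokens on j's side
    parts = ([f"{t.text}/{t.dep_}^" for t in up] + [common.text]
             + [f"v{t.dep_}/{t.text}" for t in down])
    return " ".join(parts)

doc = nlp("the doubly heavy hydrogen atom called deuterium")
i = next(t.i for t in doc if t.text == "atom")
j = next(t.i for t in doc if t.text == "deuterium")
print(dependency_path(doc, i, j))
# Something like: atom vacl/called voprd/deuterium   (exact labels depend on the parser)
```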
  • One of 70,000 patterns
    “<superordinate> ‘called’ <subordinate>”
    Learned from cases such as:
    “sarcoma / cancer”: …an uncommon bone cancer called osteogenic sarcoma and to…
    “deuterium / atom”: …heavy water rich in the doubly heavy hydrogen atom called deuterium.
    New pairs discovered:
    “efflorescence / condition”: …and a condition called efflorescence are other reasons for…
    “o’neal_inc / company”: …The company, now called O'Neal Inc., was sole distributor of E-Ferol…
    “hat_creek_outfit / ranch” …run a small ranch called the Hat Creek Outfit.
    “hiv-1 / aids_virus” …infected by the AIDS virus, called HIV-1.
    “bateau_mouche / attraction” …local sightseeing attraction called the Bateau Mouche...
    “kibbutz_malkiyya / collective_farm” …an Israeli collective farm called Kibbutz Malkiyya…
  • Hypernym Precision / Recall for all Features
    Slide from Rion Snow
  • Idea: use each pattern as a feature! Precision/Recall for Hypernym Classification
    logistic regression
    10-fold cross-validation on 14,000 WordNet-labeled pairs
    slide from Rion Snow
  • Outline
    Reminder: Named Entity Tagging
    Relation Extraction
    Hand-built patterns
    Seed (bootstrap) methods
    Supervised classification
    Distant supervision