A Pattern-Based Approach to Hyponymy Relation Acquisition for the Agricultural Thesaurus
Presentation given by Makoto Nakamura, Ryusei Kobayashi, Yasuhiro Ogawa and Katsuhiko Toyama at the Agricultural Ontology Service (AOS) Workshop 2012 in Kuching, Sarawak, Malaysia, September 3-4, 2012.

    Presentation Transcript

    • A Pattern-Based Approach to Hyponymy Relation Acquisition for the Agricultural Thesaurus
      • *Makoto Nakamura, Ryusei Kobayashi, Yasuhiro Ogawa, Katsuhiko Toyama (Nagoya University, Japan)
    • First of All...
      • Japan Legal Information Institute, Graduate School of Law, Nagoya University, Japan
        • established in order to provide Japan's legal information to the world
        • Tasks
          • Base for issuing English translations of Japanese legal information
          • Provision of Japanese legal information overseas
          • Development of software for legislative support
          • etc.
      • Natural language processing for legal texts
    • Outline
      1. Introduction
         1. Example of a hyponymy relation
         2. AGROVOC
         3. Purpose
      2. Previous Works on Legal Text Processing
      3. Acquisition of Legal Terms from Legal Texts
      4. Experiments
      5. Conclusion
    • Introduction
      • The goal of this study:
        • to construct a legal ontology based on legal terms
      • Legal terms are:
        • special, idiomatic expressions that often describe legal matters in legal documents
        • defined by law prior to use
    • Example of a Hyponymy Relation in a Law
      • Input text:
        • 漁業法 第六条3 「定置漁業」とは、漁具を定置して営む漁業であつて次に掲げるものをいう。(以下略)
        • Fishery Act, Article 6-3: The "fixed gear fishery" refers to a fishery operated with fixed gear, which falls under any of the following items. *snip*
      • Hyponymy relation:
        • Hypernym: 漁業 / fishery
        • Hyponym: 定置漁業 / fixed gear fishery
    • AGROVOC
      • AGROVOC (Niu et al., 2012)
        • the world's most comprehensive multilingual agricultural vocabulary
        • contains more than 40,000 concepts in 21 languages
        • covers topics on food, nutrition, agriculture, fisheries, forestry, environment, and other related domains
        • expressed in the Simple Knowledge Organization System (SKOS) and published as Linked Data
      • All the terms and concepts have been added to the thesaurus by domain experts in different languages.
      • This manual work is very time-consuming and expensive.
    • Purpose
      • to acquire hyponymy relations from the legal corpus
      • to extend the vocabulary of AGROVOC
      • Assumption: legal terms qualify for AGROVOC as long as they are related to the agricultural domain.
    • Example of Word Tree in AGROVOC
      • [Figure: AGROVOC entry for "fisheries" (EN): status Descriptor, created 1981-01-09, last modified 1981-01-09. The word tree shows its broader terms, related terms, used-for terms and narrower terms; the narrower terms include Marine fisheries, Inland fisheries and capture fisheries.]
    • How to Add Legal Terms to AGROVOC
      • [Figure: the same word tree for "fisheries" (EN), with the legal term "fixed gear fisheries" added.]
    • Previous Works on Legal Text Processing
      • Legal text processing using surface pattern recognition
        • Knowledge acquisition from itemized expressions (Kimura et al., 2008)
        • Detection of legal definitions (Höfler et al., 2012)
        • Surface pattern recognition is sufficient for boilerplate (fixed) expressions.
      • Hyponymy relation acquisition
        • The expressions "y is a (kind of) x" and "such x as y" (Miller et al., 1990; Hearst, 1992)
        • This approach is applicable to Japanese (Ando et al., 2004)
      • Legal ontologies could be constructed automatically from legal texts containing boilerplate expressions.
    • Outline
      1. Introduction
      2. Previous Works on Legal Text Processing
      3. Acquisition of Legal Terms from Legal Texts
         1. Extracting terms and their explanations in Japanese legal texts
         2. Text processing for hyponymy relations
      4. Experiments
      5. Conclusion
    • Legal Corpus
      • A set of statutory sentences from laws and regulations
      • A set of 109,380 Japanese legal sentences in 241 laws and regulations
      • A wide variety of laws and regulations: Bankruptcy Act / Measurement Act / Act on Promotion of Global Warming Countermeasures, etc.
    • Step-1: Example of Surface Pattern Rules
      • Pattern of definitions:
        • /「(.+)」とは、(.+)(を、|をいい、|といい、|という。|とする。)/
        • /”(.+)” (as used in this Act)? (shall mean|means) (.+)./
      • Input text (ガス事業法):
        • 第二条 この法律において「一般ガス事業」とは、一般の需要に応じ導管によりガスを供給する事業をいう。
        • 第二条10 この法律において「ガス事業」とは、一般ガス事業、簡易ガス事業、ガス導管事業及び大口ガス事業をいう。
      • Input text (Gas Business Act):
        • Article 2-1 The term "General Gas Utility Business" as used in this Act shall mean the business of supplying gas via pipelines to meet general demand.
        • Article 2-10 The term "Gas Business" as used in this Act shall mean a General Gas Utility Business, Community Gas Utility Business, Gas Pipeline Service Business or Large-Volume Gas Business.
    • Step-1: Acquisition of Definitions and Explanations
      • Output definitions (Japanese):
        1. Term: 一般ガス事業; Explanation: 一般の需要に応じ導管によりガスを供給する事業
        2. Term: ガス事業; Explanation: 一般ガス事業、簡易ガス事業、ガス導管事業及び大口ガス事業
      • Output definitions (English):
        1. Term: General Gas Utility Business; Explanation: the business of supplying gas via pipelines to meet general demand
        2. Term: Gas Business; Explanation: a General Gas Utility Business, Community Gas Utility Business, Gas Pipeline Service Business or Large-Volume Gas Business
      • We made 6 patterns for extracting definitions from the legal corpus.
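The surface-pattern step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `extract_definition` is hypothetical, the groups are made non-greedy, and the ending `をいう。` is added as an assumption because the pattern printed on the slide does not list it, although both example sentences end that way.

```python
import re

# Japanese pattern adapted from the slide. Assumptions of this sketch:
# non-greedy groups, and "をいう。" added to the list of endings.
JA_DEFINITION = re.compile(
    r"「(.+?)」とは、(.+?)(?:を、|をいい、|といい、|という。|とする。|をいう。)"
)

# English pattern adapted from the slide; accepts straight or curly quotes.
EN_DEFINITION = re.compile(
    r'[“"](.+?)[”"] (?:as used in this Act )?(?:shall mean|means) (.+)\.'
)


def extract_definition(sentence, pattern):
    """Return (term, explanation) if the sentence is a definition, else None."""
    match = pattern.search(sentence)
    if match is None:
        return None
    return match.group(1), match.group(2)


ja = "この法律において「一般ガス事業」とは、一般の需要に応じ導管によりガスを供給する事業をいう。"
en = ("The term “General Gas Utility Business” as used in this Act shall mean "
      "the business of supplying gas via pipelines to meet general demand.")

print(extract_definition(ja, JA_DEFINITION))
print(extract_definition(en, EN_DEFINITION))
```

A sentence that matches none of the patterns simply yields `None`, so non-definition sentences in the corpus are skipped.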
    • Step-2: Extraction of the Hypernym from a Dependency Tree (intensive)
      • Explanation of 'General Gas Utility Business': 一般の 需要に 応じ 導管に より ガスを 供給する 事業 (the business of supplying gas via pipelines to meet general demand)
        • Word glosses: 一般の = general, 需要に = demand, 応じ = meet, 導管に = pipelines, より = via, ガスを = gas, 供給する = supply, 事業 = business
        • The head word 事業 (business) is the hypernym.
      • Intensive (hypernym), extensive (hyponym), and mixed patterns
      • The head word(s) become the hypernym of the term.
      • CaboCha, a Japanese dependency parser (Kudo et al., 2002)
      • The parser is complemented with special terms and syntactic rules peculiar to the legal domain (Ogawa et al., 2011)
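The idea of taking the head word as the hypernym can be sketched on a hand-written dependency structure. The `Chunk` class, the `head_chunk` function and the dependency links below are assumptions made for illustration; they are not CaboCha's API or its actual output for this sentence.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str  # surface form of the phrase
    link: int  # index of the chunk this one depends on; -1 marks the root


def head_chunk(chunks):
    """Return the root chunk, i.e. the one that depends on no other chunk."""
    roots = [chunk for chunk in chunks if chunk.link == -1]
    if len(roots) != 1:
        raise ValueError("expected exactly one root chunk")
    return roots[0]


# Explanation of 'General Gas Utility Business', segmented as on the slide.
# The links are hand-assigned for this sketch (hypothetical parse).
chunks = [
    Chunk("一般の", 1),    # general  -> demand
    Chunk("需要に", 2),    # demand   -> meet
    Chunk("応じ", 6),      # meet     -> supply
    Chunk("導管に", 4),    # pipelines -> via
    Chunk("より", 6),      # via      -> supply
    Chunk("ガスを", 6),    # gas      -> supply
    Chunk("供給する", 7),  # supply   -> business
    Chunk("事業", -1),     # business (root, head word)
]

hypernym = head_chunk(chunks).text
print(("事業" if hypernym == "事業" else hypernym, "一般ガス事業"))
```

Because Japanese is head-final, the root is the last chunk here, which matches the slide's head word 事業 (business).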
    • Step-2: Extraction of the Hyponyms by a Cue Phrase (extensive)
      • Explanation of 'Gas Business': 一般ガス事業、簡易ガス事業、ガス導管事業及び大口ガス事業 (General Gas Utility Business, Community Gas Utility Business, Gas Pipeline Service Business or Large-Volume Gas Business)
        • Each of the four listed items is a hyponym.
      • Classification by cue phrases (the comma "," and "or"; 、 and 及び in the Japanese text)
      • Hyponymy relation: a tuple of two noun phrases and a conceptual relation
      • Hyponymy relation:
        • Hypernym: ガス事業 / Gas Business
        • Hyponym: 一般ガス事業 / General Gas Utility Business
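Splitting an enumerating explanation on cue phrases can be sketched as follows. The function names `split_hyponyms` and `hyponymy_relations` are hypothetical, and the cue set is limited to the two cues visible in the slide's Japanese example (the comma 、 and the conjunction 及び); a real rule set would likely need more.

```python
import re

# Cue phrases taken from the slide's example: Japanese comma and "及び".
CUE_PHRASES = re.compile(r"、|及び")


def split_hyponyms(explanation):
    """Split an enumerating explanation into its listed noun phrases."""
    return [part for part in CUE_PHRASES.split(explanation) if part]


def hyponymy_relations(term, explanation):
    """Return (hypernym, hyponym) pairs: the defined term is the hypernym."""
    return [(term, hyponym) for hyponym in split_hyponyms(explanation)]


explanation = "一般ガス事業、簡易ガス事業、ガス導管事業及び大口ガス事業"
for relation in hyponymy_relations("ガス事業", explanation):
    print(relation)
```

In this extensive case the direction is the reverse of the head-word case: the defined term becomes the hypernym and each listed item becomes one of its hyponyms.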
    • Experiments
      • Purpose
        • Acquisition of hyponymy relations qualified for AGROVOC
      • The legal corpus
        • 109,380 Japanese legal sentences
      • Classification of hyponymy relations
        • [Figure: 2x2 grid of categories (i)-(iv), by whether the hypernym and the hyponym are new or already existing in AGROVOC]
        • Category (i): Neither the hypernym nor the hyponym is registered in AGROVOC.
        • Category (ii): Only the hyponym is not registered.
        • Category (iii): Only the hypernym is not registered.
        • Category (iv): Both are registered.
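The four categories amount to a membership test on each side of the pair. The sketch below assumes a plain set of labels as a stand-in for AGROVOC; the function name `classify` and the toy vocabulary are hypothetical and do not reflect the real thesaurus contents or how the authors performed the lookup.

```python
def classify(hypernym, hyponym, vocabulary):
    """Return the category (i)-(iv) of a hyponymy pair against a vocabulary."""
    hypernym_registered = hypernym in vocabulary
    hyponym_registered = hyponym in vocabulary
    if not hypernym_registered and not hyponym_registered:
        return "i"    # neither term is registered
    if hypernym_registered and not hyponym_registered:
        return "ii"   # only the hyponym is not registered
    if not hypernym_registered and hyponym_registered:
        return "iii"  # only the hypernym is not registered
    return "iv"       # both terms are registered


# Toy stand-in for the thesaurus (assumption, not actual AGROVOC data).
toy_vocabulary = {"business", "greenhouse gases", "Carbon dioxide"}

print(classify("district court", "bankruptcy court", toy_vocabulary))
print(classify("business", "General Gas Utility Business", toy_vocabulary))
print(classify("greenhouse gases", "Carbon dioxide", toy_vocabulary))
```

Categories (ii) and (iii) are the ones that attach a new term to a concept the thesaurus already has, which is why they are the candidates for extending the vocabulary.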
    • Experimental Result
      • Experimental result in finding terms related to AGROVOC (category of a hyponymy pair: number of types, precision)
        • Category (i)-(iv): 1,027 types, precision 64.0%†
          • Category (ii) & (iii): 222 types, precision 67.1%
            • Category (ii): 137 types, precision 89.1%
            • Category (iii): 75 types, precision 21.3%
            • Unknown: 10 types, precision not given
          • Category (iv): 25 types, precision 88.0%
            • Existing relations: 9 types, precision 88.9%
            • New relations: 16 types, precision 87.5%
      • † is calculated from 100 samples chosen at random.
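As a consistency check on the Category (iv) rows, precision here is taken to mean correct pairs divided by extracted pairs. The counts of correct pairs below are not stated on the slide; they are inferred as the only whole numbers that reproduce the reported percentages, and the helper `precision` is a hypothetical name.

```python
def precision(correct, total):
    """Precision as a percentage, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# Totals are from the slide; the correct counts are inferred (assumption).
existing_correct, existing_total = 8, 9
new_correct, new_total = 14, 16

print(precision(existing_correct, existing_total))
print(precision(new_correct, new_total))
print(precision(existing_correct + new_correct, existing_total + new_total))
```

Under these inferred counts the two sub-rows combine to the reported figure for Category (iv) as a whole.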
    • Example of Hyponymy Relations
      • Category (i)
        • Example 1: district court / bankruptcy court
        • Example 2: *maximum limit / total allowable effort
      • Category (ii)
        • Example 1: business / General Gas Utility Business
        • Example 2: *injurious plant / fungus
      • Category (iii)
        • Example 1: oocyte / Unfertilized Egg
        • Example 2: *measuring instrument / equipment
      • Category (iv)
        • Example 1: greenhouse gases / Carbon dioxide
        • Example 2: real property / land
      • Category -
        • *common fishery / fishery
    • Conclusion
      • Since legal documents are likely to use fixed expressions, surface pattern rules work well for term extraction.
      • We succeeded in finding 222 terms that seem qualified for AGROVOC with high precision.
      • Some error-prone rules and a procedural mistake were detected.
      • We plan to extend our method to multiple languages.
        • As long as boilerplate expressions are used often, our simple method is applicable to any language.
        • Another approach is to use bilingual lexicons as a dictionary (Jin et al., 2012)
    • Thank you
      • A Pattern-Based Approach to Hyponymy Relation Acquisition for the Agricultural Thesaurus
      • Makoto Nakamura (mnakamur@law.nagoya-u.ac.jp)