An Alignment-based Pattern Representation Model for Information Extraction

Seokhwan Kim, Scientist at Institute for Infocomm Research

Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee
{megaup, stardust, gblee}@postech.ac.kr
Abstract - In this paper, we propose an alternative pattern representation model and an effective method of utilizing it. Whereas previous pattern
representation models depend entirely on the result of dependency analysis, our approach is based on lexical alignment and
treats the result of dependency analysis only as a feature of the alignment process. In this way, it can cope with the errors of
incomplete dependency analysis. An evaluation on a scenario template task shows that the proposed model outperforms previous
syntax-dependent models.


Pattern Representation Model for Information Extraction

Information Extraction
  Extracting the defined number of relevant arguments from natural language documents
  Subtasks
    # of arguments    subtask
    1                 named-entity recognition
    2                 binary relation extraction
    more than 2       relation/event extraction
  Approach
    Automatic Pattern Learning
      Pattern Representation Model
      Pattern Learning Algorithm

Pattern Representation Model for IE
  Problem Definition
    Ex) About 50 peasants have been kidnapped by terrorists of the FMNL
    Extraction Pattern: ?
      incident type   kidnapping
      perp_ind        terrorists
      perp_org        FMNL
      hum_tgt         peasants

Related Works
  Lexical Sequence Pattern Models
    A set of lexical sequences
      <hum_tgt> be kidnapped
      be kidnapped by <perp_ind>
  Syntax-dependent Pattern Models
    A set of subtrees (from D-tree)
      kidnapped
        nsubjpass: peasants
        agent: terrorists
          prep_of: FMNL
      (kidnapped ({HUM_TGT}-nsubjpass))
      (kidnapped ({PERP_IND}-agent))
      (kidnapped ({PERP_IND}-agent ({PERP_ORG}-prep_of)))

Our Approach
  Pattern Model
    Lexical Sequence Pattern + Term Weight (from Dependency Analysis)
      <HUM_TGT>   of     [NP]   have   been   kidnapped   by    <PERP_IND>
      1           0.33   0.33   4      4      4           1.5   1.5
  Soft Pattern Matching
    Sequence Alignment
      sentence:  about 50 peasants have been kidnapped by terrorists
      pattern:   <HUM_TGT> of [NP] have been kidnapped by <PERP_IND>

Method
  Pattern Sequences Extraction
    1) Searching the sentences containing all arguments of each tuple in source documents
    2) Segmenting out the subpart of the sentence based on clausal boundaries
    3) Replacing the parts of arguments in the sub-sentence with argument labels
  Computing Term Weights
    w_i = (r_i + c) / (d_i + c)
      w_i : weight of the i-th term
      r_i : number of relevant terms within the subtree with t_i as root
      d_i : distance from the root node
      c   : for smoothing (default: 1)
    Ex)
      kidnapped                      (3+1)/(0+1) = 4
        nsubjpass: <HUM_TGT>         (1+1)/(1+1) = 1
          prep_of: [NP]              (0+1)/(2+1) = 0.33
        agent: <PERP_IND>            (2+1)/(1+1) = 1.5
          prep_of: <PERP_ORG>        (1+1)/(2+1) = 0.67
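
The term-weight formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary encoding of the dependency tree and the names TREE, is_argument_label, count_relevant and term_weights are hypothetical, a "relevant term" is assumed to be an argument label (counting the node itself), and the slot labels follow the perp_ind / perp_org spelling of the template slots.

```python
# Sketch of w_i = (r_i + c) / (d_i + c) on the poster's example tree.
# TREE, is_argument_label, count_relevant and term_weights are hypothetical
# names introduced for this illustration.

TREE = {
    "kidnapped": ["<HUM_TGT>", "<PERP_IND>"],
    "<HUM_TGT>": ["[NP]"],
    "<PERP_IND>": ["<PERP_ORG>"],
    "[NP]": [],
    "<PERP_ORG>": [],
}


def is_argument_label(term):
    """Assumption: a 'relevant term' is an argument label such as <HUM_TGT>."""
    return term.startswith("<") and term.endswith(">")


def count_relevant(tree, node):
    """r_i: argument labels in the subtree rooted at node (node included)."""
    own = 1 if is_argument_label(node) else 0
    return own + sum(count_relevant(tree, child) for child in tree[node])


def term_weights(tree, root, c=1.0):
    """Return {term: w_i}, with d_i taken as the depth below the root."""
    weights = {}
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        weights[node] = (count_relevant(tree, node) + c) / (depth + c)
        stack.extend((child, depth + 1) for child in tree[node])
    return weights


weights = term_weights(TREE, "kidnapped")
for term, w in weights.items():
    print(f"{term:12s} {w:.2f}")
```

Under these assumptions the sketch gives 4 for kidnapped, 1 for <HUM_TGT>, 0.33 for [NP], 1.5 for <PERP_IND> and 0.67 for <PERP_ORG>, which agrees with the worked example.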
Pattern Matching
  Sequence Alignment
    Based on dynamic programming
  Alignment Matrix
                  peasants   have   been   kidnapped   by     terrorists
    <HUM_TGT>     1          0      0      0           0      0
    of            0          1      0      0           0      0
    [NP]          0          0.66   1      0           0      0
    have          0          4      3      2           1      0
    been          0          0      8      7           6      5
    kidnapped     0          0      4      12          11     10
    by            0          0      0      8           13.5   12.5
    <PERP_IND>    1.5        0.5    0      0           0      15
  Matrix Computation
    M_{i,j} = max { M_{i-1,j-1} + sim_{i-1,j-1} * w_{i-1},
                    M_{i-1,j} + gp * w_{i-1},
                    M_{i,j-1} + gp * w_i,
                    0 }

Experiment
  Experimental Setup
    Data
      MUC-3/4 Data
        About the Terrorism Events
        Simpler template structure with 4 slots: perp_ind, perp_org, phys_tgt, hum_tgt
        Dev-set (training), Test-set (evaluation)
    Preprocessing
      Dependency Parsing and NP-chunking
        Stanford Parser
    Extracting Pattern Candidates
      Selecting all pattern candidates for test
      Without pattern filtering
      To compare the representative performance of the pattern models, not the pattern filtering method
  Experimental Result
    Pattern Models
      SVO Model (Yangarber '00)
      Linked-Chain Model (Greenwood '06)
      Subtree Model (Sudo '03)
      Our Model
    Result
      Model          Precision   Recall   F-measure
      SVO            21.74       20.62    21.16
      Linked-Chain   20.04       26.55    22.84
      Subtree        23.34       32.73    27.25
      Alignment      23.35       45.62    30.89
    Our proposed model achieved much higher recall than the other models with similar precision
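
The matrix computation used for pattern matching can be sketched as a small dynamic-programming routine. The poster does not define sim or gp, so this sketch assumes that sim returns 1 for a match and 0 otherwise, that gp = -1, and that both gap moves are scaled by the weight of the current pattern term (a simplification of the indices in the recurrence). NOUN_PHRASES, sim and align are hypothetical names.

```python
# Sketch of the weighted alignment recurrence. Assumptions (not from the
# poster): sim() returns 1 for a match and 0 otherwise, gp = -1, and both gap
# moves are scaled by the weight of the current pattern term.

NOUN_PHRASES = {"peasants", "terrorists"}  # hypothetical NP-chunker output


def sim(pattern_term, token):
    """1.0 if the token equals the term, or a slot/[NP] term meets an NP."""
    if pattern_term == token:
        return 1.0
    is_slot = pattern_term.startswith("<") or pattern_term == "[NP]"
    return 1.0 if is_slot and token in NOUN_PHRASES else 0.0


def align(pattern, weights, tokens, gp=-1.0):
    """Fill the alignment matrix M; row 0 and column 0 are the zero border."""
    rows, cols = len(pattern) + 1, len(tokens) + 1
    M = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        w = weights[i - 1]  # weight of the pattern term in row i
        for j in range(1, cols):
            M[i][j] = max(
                M[i - 1][j - 1] + sim(pattern[i - 1], tokens[j - 1]) * w,
                M[i - 1][j] + gp * w,  # pattern term left unmatched
                M[i][j - 1] + gp * w,  # sentence token left unmatched
                0.0,                   # restart, as in local alignment
            )
    return M


pattern = ["<HUM_TGT>", "of", "[NP]", "have", "been", "kidnapped", "by", "<PERP_IND>"]
weights = [1, 0.33, 0.33, 4, 4, 4, 1.5, 1.5]
tokens = ["peasants", "have", "been", "kidnapped", "by", "terrorists"]

M = align(pattern, weights, tokens)
best = max(max(row) for row in M)
print(f"best alignment score: {best:.2f}")
```

Because sim, gp and the gap weighting are assumed here, the cell values need not equal those in the poster's alignment matrix. The sketch only shows how term weights let heavily weighted terms such as "kidnapped" dominate the score while skipped low-weight terms such as "of" cost little.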
Ad

Recommended

Master of Science Thesis Defense - Souma (FIU)
Master of Science Thesis Defense - Souma (FIU)Master of Science Thesis Defense - Souma (FIU)
Master of Science Thesis Defense - Souma (FIU)Souma Chowdhury
 
OOP Chapter 6: Making Decisions
OOP Chapter 6: Making DecisionsOOP Chapter 6: Making Decisions
OOP Chapter 6: Making DecisionsAtit Patumvan
 
Predicting performance in Recommender Systems - Poster
Predicting performance in Recommender Systems - PosterPredicting performance in Recommender Systems - Poster
Predicting performance in Recommender Systems - PosterAlejandro Bellogin
 
ICMI 2012 Workshop on gesture and speech production
ICMI 2012 Workshop on gesture and speech productionICMI 2012 Workshop on gesture and speech production
ICMI 2012 Workshop on gesture and speech productionLê Anh
 
Performance optimization of hybrid fusion cluster based cooperative spectrum ...
Performance optimization of hybrid fusion cluster based cooperative spectrum ...Performance optimization of hybrid fusion cluster based cooperative spectrum ...
Performance optimization of hybrid fusion cluster based cooperative spectrum ...Ayman El-Saleh
 
Objectiveccheatsheet
ObjectiveccheatsheetObjectiveccheatsheet
Objectiveccheatsheetiderdelzo
 

More Related Content

What's hot

Detecting aspect-specific code smells using Ekeko for AspectJ
Detecting aspect-specific code smells using Ekeko for AspectJDetecting aspect-specific code smells using Ekeko for AspectJ
Detecting aspect-specific code smells using Ekeko for AspectJCoen De Roover
 
Dart function - Recursive functions
Dart function - Recursive functionsDart function - Recursive functions
Dart function - Recursive functionsKoAungThuOo1
 
Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Coen De Roover
 
The SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
The SOUL Tool Suite for Querying Programs in Symbiosis with EclipseThe SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
The SOUL Tool Suite for Querying Programs in Symbiosis with EclipseCoen De Roover
 
django095-cheat-sheet
django095-cheat-sheetdjango095-cheat-sheet
django095-cheat-sheetwebuploader
 
Scientific computing
Scientific computingScientific computing
Scientific computingBrijesh Kumar
 

What's hot (9)

Detecting aspect-specific code smells using Ekeko for AspectJ
Detecting aspect-specific code smells using Ekeko for AspectJDetecting aspect-specific code smells using Ekeko for AspectJ
Detecting aspect-specific code smells using Ekeko for AspectJ
 
Dart function - Recursive functions
Dart function - Recursive functionsDart function - Recursive functions
Dart function - Recursive functions
 
Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012
 
JSTLQuick Reference
JSTLQuick ReferenceJSTLQuick Reference
JSTLQuick Reference
 
The SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
The SOUL Tool Suite for Querying Programs in Symbiosis with EclipseThe SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
The SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
 
Discovery Of Functional Protein Linear Motifs Using a Greaddy Algorithm and I...
Discovery Of Functional Protein Linear Motifs Using a Greaddy Algorithm and I...Discovery Of Functional Protein Linear Motifs Using a Greaddy Algorithm and I...
Discovery Of Functional Protein Linear Motifs Using a Greaddy Algorithm and I...
 
django095-cheat-sheet
django095-cheat-sheetdjango095-cheat-sheet
django095-cheat-sheet
 
Chtp405
Chtp405Chtp405
Chtp405
 
Scientific computing
Scientific computingScientific computing
Scientific computing
 

Viewers also liked

Presenter and Decorator in Rails
Presenter and Decorator in RailsPresenter and Decorator in Rails
Presenter and Decorator in RailsThaichor Seng
 
Functional reactive programming
Functional reactive programmingFunctional reactive programming
Functional reactive programmingAraf Karsh Hamid
 
Phani Kumar - Decorator Pattern
Phani Kumar - Decorator PatternPhani Kumar - Decorator Pattern
Phani Kumar - Decorator Patternmelbournepatterns
 
Builder Design Pattern (Generic Construction -Different Representation)
Builder Design Pattern (Generic Construction -Different Representation)Builder Design Pattern (Generic Construction -Different Representation)
Builder Design Pattern (Generic Construction -Different Representation)Sameer Rathoud
 
How I Learned To Apply Design Patterns
How I Learned To Apply Design PatternsHow I Learned To Apply Design Patterns
How I Learned To Apply Design PatternsAndy Maleh
 
Pattern Languages — An Approach to Holistic Knowledge Representation
Pattern Languages — An Approach to Holistic Knowledge RepresentationPattern Languages — An Approach to Holistic Knowledge Representation
Pattern Languages — An Approach to Holistic Knowledge RepresentationDouglas Schuler
 
Software Design Patterns - Selecting the right design pattern
Software Design Patterns - Selecting the right design patternSoftware Design Patterns - Selecting the right design pattern
Software Design Patterns - Selecting the right design patternJoao Pereira
 
Decorator Design Pattern
Decorator Design PatternDecorator Design Pattern
Decorator Design PatternAdeel Riaz
 

Viewers also liked (10)

Presenter and Decorator in Rails
Presenter and Decorator in RailsPresenter and Decorator in Rails
Presenter and Decorator in Rails
 
Functional reactive programming
Functional reactive programmingFunctional reactive programming
Functional reactive programming
 
Phani Kumar - Decorator Pattern
Phani Kumar - Decorator PatternPhani Kumar - Decorator Pattern
Phani Kumar - Decorator Pattern
 
Builder Design Pattern (Generic Construction -Different Representation)
Builder Design Pattern (Generic Construction -Different Representation)Builder Design Pattern (Generic Construction -Different Representation)
Builder Design Pattern (Generic Construction -Different Representation)
 
How I Learned To Apply Design Patterns
How I Learned To Apply Design PatternsHow I Learned To Apply Design Patterns
How I Learned To Apply Design Patterns
 
Pattern Languages — An Approach to Holistic Knowledge Representation
Pattern Languages — An Approach to Holistic Knowledge RepresentationPattern Languages — An Approach to Holistic Knowledge Representation
Pattern Languages — An Approach to Holistic Knowledge Representation
 
Observer pattern
Observer patternObserver pattern
Observer pattern
 
Software Design Patterns - Selecting the right design pattern
Software Design Patterns - Selecting the right design patternSoftware Design Patterns - Selecting the right design pattern
Software Design Patterns - Selecting the right design pattern
 
Decorator Design Pattern
Decorator Design PatternDecorator Design Pattern
Decorator Design Pattern
 
Design pattern
Design patternDesign pattern
Design pattern
 

More from Seokhwan Kim

The Eighth Dialog System Technology Challenge (DSTC8)
The Eighth Dialog System Technology Challenge (DSTC8)The Eighth Dialog System Technology Challenge (DSTC8)
The Eighth Dialog System Technology Challenge (DSTC8)Seokhwan Kim
 
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...Seokhwan Kim
 
Dynamic Memory Networks for Dialogue Topic Tracking
Dynamic Memory Networks for Dialogue Topic TrackingDynamic Memory Networks for Dialogue Topic Tracking
Dynamic Memory Networks for Dialogue Topic TrackingSeokhwan Kim
 
The Fifth Dialog State Tracking Challenge (DSTC5)
The Fifth Dialog State Tracking Challenge (DSTC5)The Fifth Dialog State Tracking Challenge (DSTC5)
The Fifth Dialog State Tracking Challenge (DSTC5)Seokhwan Kim
 
Natural Language in Human-Robot Interaction
Natural Language in Human-Robot InteractionNatural Language in Human-Robot Interaction
Natural Language in Human-Robot InteractionSeokhwan Kim
 
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...Seokhwan Kim
 
The Fourth Dialog State Tracking Challenge (DSTC4)
The Fourth Dialog State Tracking Challenge (DSTC4)The Fourth Dialog State Tracking Challenge (DSTC4)
The Fourth Dialog State Tracking Challenge (DSTC4)Seokhwan Kim
 
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...Seokhwan Kim
 
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...Seokhwan Kim
 
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...Seokhwan Kim
 
Sequential Labeling for Tracking Dynamic Dialog States
Sequential Labeling for Tracking Dynamic Dialog StatesSequential Labeling for Tracking Dynamic Dialog States
Sequential Labeling for Tracking Dynamic Dialog StatesSeokhwan Kim
 
Wikipedia-based Kernels for Dialogue Topic Tracking
Wikipedia-based Kernels for Dialogue Topic TrackingWikipedia-based Kernels for Dialogue Topic Tracking
Wikipedia-based Kernels for Dialogue Topic TrackingSeokhwan Kim
 
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...Seokhwan Kim
 
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...Seokhwan Kim
 
MMR-based active machine learning for Bio named entity recognition
MMR-based active machine learning for Bio named entity recognitionMMR-based active machine learning for Bio named entity recognition
MMR-based active machine learning for Bio named entity recognitionSeokhwan Kim
 
A semi-supervised method for efficient construction of statistical spoken lan...
A semi-supervised method for efficient construction of statistical spoken lan...A semi-supervised method for efficient construction of statistical spoken lan...
A semi-supervised method for efficient construction of statistical spoken lan...Seokhwan Kim
 
A spoken dialog system for electronic program guide information access
A spoken dialog system for electronic program guide information accessA spoken dialog system for electronic program guide information access
A spoken dialog system for electronic program guide information accessSeokhwan Kim
 
An alignment-based approach to semi-supervised relation extraction including ...
An alignment-based approach to semi-supervised relation extraction including ...An alignment-based approach to semi-supervised relation extraction including ...
An alignment-based approach to semi-supervised relation extraction including ...Seokhwan Kim
 
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템Seokhwan Kim
 
A Cross-Lingual Annotation Projection Approach for Relation Detection
A Cross-Lingual Annotation Projection Approach for Relation DetectionA Cross-Lingual Annotation Projection Approach for Relation Detection
A Cross-Lingual Annotation Projection Approach for Relation DetectionSeokhwan Kim
 

More from Seokhwan Kim (20)

The Eighth Dialog System Technology Challenge (DSTC8)
The Eighth Dialog System Technology Challenge (DSTC8)The Eighth Dialog System Technology Challenge (DSTC8)
The Eighth Dialog System Technology Challenge (DSTC8)
 
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
 
Dynamic Memory Networks for Dialogue Topic Tracking
Dynamic Memory Networks for Dialogue Topic TrackingDynamic Memory Networks for Dialogue Topic Tracking
Dynamic Memory Networks for Dialogue Topic Tracking
 
The Fifth Dialog State Tracking Challenge (DSTC5)
The Fifth Dialog State Tracking Challenge (DSTC5)The Fifth Dialog State Tracking Challenge (DSTC5)
The Fifth Dialog State Tracking Challenge (DSTC5)
 
Natural Language in Human-Robot Interaction
Natural Language in Human-Robot InteractionNatural Language in Human-Robot Interaction
Natural Language in Human-Robot Interaction
 
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
 
The Fourth Dialog State Tracking Challenge (DSTC4)
The Fourth Dialog State Tracking Challenge (DSTC4)The Fourth Dialog State Tracking Challenge (DSTC4)
The Fourth Dialog State Tracking Challenge (DSTC4)
 
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
 
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
 
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
 
Sequential Labeling for Tracking Dynamic Dialog States
Sequential Labeling for Tracking Dynamic Dialog StatesSequential Labeling for Tracking Dynamic Dialog States
Sequential Labeling for Tracking Dynamic Dialog States
 
Wikipedia-based Kernels for Dialogue Topic Tracking
Wikipedia-based Kernels for Dialogue Topic TrackingWikipedia-based Kernels for Dialogue Topic Tracking
Wikipedia-based Kernels for Dialogue Topic Tracking
 
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
 
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
 
MMR-based active machine learning for Bio named entity recognition
MMR-based active machine learning for Bio named entity recognitionMMR-based active machine learning for Bio named entity recognition
MMR-based active machine learning for Bio named entity recognition
 
A semi-supervised method for efficient construction of statistical spoken lan...
A semi-supervised method for efficient construction of statistical spoken lan...A semi-supervised method for efficient construction of statistical spoken lan...
A semi-supervised method for efficient construction of statistical spoken lan...
 
A spoken dialog system for electronic program guide information access
A spoken dialog system for electronic program guide information accessA spoken dialog system for electronic program guide information access
A spoken dialog system for electronic program guide information access
 
An alignment-based approach to semi-supervised relation extraction including ...
An alignment-based approach to semi-supervised relation extraction including ...An alignment-based approach to semi-supervised relation extraction including ...
An alignment-based approach to semi-supervised relation extraction including ...
 
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
 
A Cross-Lingual Annotation Projection Approach for Relation Detection
A Cross-Lingual Annotation Projection Approach for Relation DetectionA Cross-Lingual Annotation Projection Approach for Relation Detection
A Cross-Lingual Annotation Projection Approach for Relation Detection
 

Recently uploaded

Azure Migration Guide for IT Professionals
Azure Migration Guide for IT ProfessionalsAzure Migration Guide for IT Professionals
Azure Migration Guide for IT ProfessionalsChristine Shepherd
 
Curtain Module Manual Zigbee Neo CS01-1C.pdf
Curtain Module Manual Zigbee Neo CS01-1C.pdfCurtain Module Manual Zigbee Neo CS01-1C.pdf
Curtain Module Manual Zigbee Neo CS01-1C.pdfDomotica daVinci
 
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...
Early Tech Adoption: Foolish or Pragmatic? - 17th ISACA South Florida WOW Con...Adrian Sanabria
 
AWS reInvent 2023 recaps from Chicago AWS user group
AWS reInvent 2023 recaps from Chicago AWS user groupAWS reInvent 2023 recaps from Chicago AWS user group
AWS reInvent 2023 recaps from Chicago AWS user groupAWS Chicago
 
Dynamical systems simulation in Python for science and engineering

An Alignment-based Pattern Representation Model for Information Extraction

  • 1. An Alignment-based Pattern Representation Model for Information Extraction
    Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee
    {megaup, stardust, gblee}@postech.ac.kr

    Abstract - In this paper, we propose an alternative pattern representation model and an effective method of utilizing it. While previous pattern representation models depend completely on the result of dependency analysis, our approach is based on lexical alignment and considers the result of dependency analysis only as a meaningful feature of the alignment process. In this way, we can cope with the errors of incomplete dependency analysis. An evaluation on a scenario template task shows that our proposed model outperforms the previous syntax-dependent models.

    Pattern Representation Model for Information Extraction

    Information Extraction
      - Extracting the defined number of relevant arguments from natural language documents
      - Subtasks

          # of arguments    subtask
          1                 named-entity recognition
          2                 binary relation extraction
          more than 2       relation/event extraction

      - Approach: Automatic Pattern Learning
          - Pattern Representation Model
          - Pattern Learning Algorithm

    Pattern Representation Model for IE
      - Problem Definition
          Ex) About 50 peasants have been kidnapped by terrorists of the FMNL
              -> Extraction Pattern? ->

              incident type    kidnapping
              perp_ind         terrorists
              perp_org         FMNL
              hum_tgt          peasants

    Related Works
      - Lexical Sequence Pattern Models
          - A set of lexical sequences
              <hum_tgt> be kidnapped
              be kidnapped by <perp_ind>
      - Syntax-dependent Pattern Models
          - A set of subtrees (from D-tree)
              kidnapped
                nsubjpass -> peasants (hum_tgt)
                agent     -> terrorists (perp_ind)
                               prep_of -> FMNL (perp_org)

              (kidnapped ({HUM_TGT}-nsubjpass))
              (kidnapped ({PERP_IND}-agent))
              (kidnapped ({PERP_IND}-agent ({PERP_ORG}-prep_of)))

    Method

    Our Approach
      - Pattern Model
          - Lexical Sequence Pattern + Term Weight (from Dependency Analysis)

              term      <HUM_TGT>  of    [NP]  have  been  kidnapped  by   <PERP_IND>
              weight    1          0.33  0.33  4     4     4          1.5  1.5

      - Soft Pattern Matching
          - Sequence Alignment
              sentence:  about 50 peasants have been kidnapped by terrorists
              pattern:   <HUM_TGT> of [NP] have been kidnapped by <PERP_IND>

    Pattern Sequences Extraction
      1) Searching the sentences containing all arguments of each tuple in source documents
      2) Segmenting out the subpart of the sentence based on clausal boundaries
      3) Replacing the parts of arguments in the sub-sentence with argument labels

    Computing Term Weights

      $$ w_i = \frac{r_i + c}{d_i + c} $$

      w_i : weight of the i-th term
      r_i : number of relevant terms within a subtree with t_i as root
      d_i : distance from the root node
      c   : for smoothing (default: 1)

      Ex)
          kidnapped (root)                        (3+1)/(0+1) = 4
          <HUM_TGT> (nsubjpass)                   (1+1)/(1+1) = 1
          <PERP_IND> (agent)                      (2+1)/(1+1) = 1.5
          [NP] (prep_of under <HUM_TGT>)          (0+1)/(2+1) = 0.33
          <PERP_ORG> (prep_of under <PERP_IND>)   (1+1)/(2+1) = 0.67

    Pattern Matching
      - Sequence Alignment
          - Based on Dynamic Programming
      - Alignment Matrix

                        peasants  have  been  kidnapped  by    terrorists
          <HUM_TGT>     1         0     0     0          0     0
          of            0         1     0     0          0     0
          [NP]          0         0.66  1     0          0     0
          have          0         4     3     2          1     0
          been          0         0     8     7          6     5
          kidnapped     0         0     4     12         11    10
          by            0         0     0     8          13.5  12.5
          <PERP_IND>    1.5       0.5   0     0          0     15

      - Matrix Computation

          $$ M_{i,j} = \max \begin{cases} M_{i-1,j-1} + sim_{i-1,j-1} \cdot w_{i-1} \\ M_{i-1,j} + gp \cdot w_{i-1} \\ M_{i,j-1} + gp \cdot w_i \\ 0 \end{cases} $$

    Experiment

    Experimental Setup
      - Data
          - MUC-3/4 Data
          - About the Terrorism Events
          - Simpler template structure with 4 slots
              - perp_ind, perp_org, phys_tgt, hum_tgt
          - Dev-set (training), Test-set (evaluation)
      - Preprocessing
          - Dependency Parsing and NP-chunking
          - Stanford Parser
      - Extracting Pattern Candidates
          - Selecting all pattern candidates for test
          - Without pattern filtering
          - The aim is to compare the performance of the pattern representation models, not the pattern filtering method

    Experimental Result
      - Pattern Models
          - SVO Model (Yangarber '00)
          - Linked-Chain Model (Greenwood '06)
          - Subtree Model (Sudo '03)
          - Our Model
      - Result

          Model          Precision  Recall  F-measure
          SVO            21.74      20.62   21.16
          Linked-Chain   20.04      26.55   22.84
          Subtree        23.34      32.73   27.25
          Alignment      23.35      45.62   30.89

      - Our proposed model achieved much higher recall than the other models with similar precision
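The term-weight formula and the alignment recurrence above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The poster does not define `sim` or `gp`, so the sketch assumes a similarity of +1 for a match and -1 for a mismatch, lets argument labels and `[NP]` match any token, treats `gp` as a negative gap penalty, and reads the final score off the largest cell. The function names (`term_weight`, `default_sim`, `alignment_score`) are hypothetical, and `<PERP_IND>` stands for the perpetrator-individual slot.

```python
from typing import Callable, List, Sequence


def term_weight(relevant: int, distance: int, c: float = 1.0) -> float:
    """w_i = (r_i + c) / (d_i + c), with c as the smoothing constant."""
    return (relevant + c) / (distance + c)


def default_sim(term: str, token: str) -> float:
    """Assumed similarity: labels like <HUM_TGT> or [NP] match any token."""
    if term.startswith("<") or term.startswith("["):
        return 1.0
    return 1.0 if term == token else -1.0


def alignment_score(
    pattern: Sequence[str],
    weights: Sequence[float],
    sentence: Sequence[str],
    gp: float = -1.0,  # assumed gap penalty
    sim: Callable[[str, str], float] = default_sim,
) -> float:
    """Fill the alignment matrix M and return its largest cell."""
    n, m = len(pattern), len(sentence)
    M: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Align pattern term i with sentence token j.
            match = M[i - 1][j - 1] + sim(pattern[i - 1], sentence[j - 1]) * weights[i - 1]
            # Skip pattern term i.
            skip_term = M[i - 1][j] + gp * weights[i - 1]
            # Skip sentence token j; w_i is clamped on the last row (assumption).
            skip_token = M[i][j - 1] + gp * weights[min(i, n - 1)]
            M[i][j] = max(match, skip_term, skip_token, 0.0)
            best = max(best, M[i][j])
    return best


# A shortened, hypothetical pattern. Function words reuse the weight of the
# term they attach to, mirroring the example values on the poster.
hum = term_weight(relevant=1, distance=1)   # <HUM_TGT>
verb = term_weight(relevant=3, distance=0)  # kidnapped (root)
perp = term_weight(relevant=2, distance=1)  # <PERP_IND>
pattern = ["<HUM_TGT>", "have", "been", "kidnapped", "by", "<PERP_IND>"]
weights = [hum, verb, verb, verb, perp, perp]

exact = "peasants have been kidnapped by terrorists".split()
noisy = "peasants have reportedly been kidnapped by terrorists".split()
print(alignment_score(pattern, weights, exact))
print(alignment_score(pattern, weights, noisy))
```

With these assumptions, a sentence that matches every pattern term in order scores the sum of the term weights (16.0 here), while an inserted word lowers the score without rejecting the sentence outright. Because mismatches and gaps near the root verb cost more than those near low-weight terms, dependency information shapes the match without being a hard requirement.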