SlideShare a Scribd company logo
An Alignment-based Pattern Representation Model
                                              for Information Extraction
                                                                         Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee
                                                                             {megaup, stardust, gblee}@postech.ac.kr
 Abstract - In this paper, we propose an alternative pattern representation model and the effective method of utilizing it. While the previous pattern
 representation models completely depend on the result of dependency analysis, our approach is basically based on the lexical alignment and
 considers the result of dependency analysis only as a meaningful feature of the alignment process. In this way, we can cope with the errors of
 incomplete dependency analysis. An evaluation of a scenario template task shows that our proposed model outperforms the previous
 syntax-dependent models.


                                                                  Pattern Representation Model for Information Extraction
 Information Extraction                                                               Pattern Representation Model for IE                                        Related Works
 Extracting the defined number of relevant                                            Problem Definition                                                          Lexical Sequence Pattern Models
arguments from natural language documents                                                    Ex)                                                                     A set of lexical sequences
 Subtasks                                                                                                                                                                   <hum_tgt> be kidnapped
                                                                                           About 50 peasants have been kidnapped by terrorists of the FMNL
                                                                                                                                                                             be kidnapped by <prep_ind>
 # of arguments                         subtask
        1                      named-entity recognition                                                                             Extraction Pattern                Syntax-dependent Pattern Models
        2                      binary relation extraction                                                                                   ?                           A set of subtrees (from D-tree)
   more than 2                 relation/event extraction
                                                                                                             incident type   kidnapping                                                      kidnapped
 Approach                                                                                                   prep_ind        terrorists                                         nsubjpass                        agent


   Automatic Pattern Learning                                                                               prep_org        FMNL                                               peasants                  terrorists
                                                                                                             hum_tgt         peasants
     Pattern Representation Model                                                                                                                                                                                       prep_of


     Pattern Learning Algorithm                                                                                                                                    (kidnapped ({HUM_TGT}-nsubjpass)    FMNL
                                                                                                                                                                    (kidnapped ({PREP_IND}-agent))
                                                                                                                                                                    (kidnapped ({PREP_IND}-agent ({PREP_ORG}-prep_of)))

                                                                                                                    Method
 Our Approach                                                                                        Pattern Sequences Extraction
  Pattern Model                                                                                     1) Searching the sentences containing all                    Ex)                      (3+1)/(0+1) = 4
    Lexical Sequence Pattern                                                                        arguments of each tuple in source documents
                                                                                                     2) Segmenting out subpart of the sentence                                    kidnapped
    + Term Weight (from Dependency Analysis)
                                                                                                     based on clausal boundaries                                nsubjpass                                agent
      <HUM_TGT>         of      [NP]     have    been      kidnapped   by     <PRED_IND>
                                                                                                     3) Replacing the parts of arguments in the
                                                                                                                                                                                (1+1)/(1+1) = 1                    (2+1)/(1+1) = 1.5
           1           0.33     0.33      4          4        4        1.5           1.5
                                                                                                     sub-sentence with argument labels
                                                                                                      Computing Term Weights                                  <HUM_TGT>                       <PREP_IND>
   Soft Pattern Matching
     Sequence Alignment                                                                                  wi = (ri + c) / (di + c)                              prep_of                                       prep_of

     about 50 peasants          have          been       kidnapped     by         terrorists              wi : weight of i-th term
                                                                                                          ri : number of relevant terms within                      [NP]                         <PREP_ORG>
                                                                                                              a subtree, ti as root
  <HUM_TGT>       of         [NP]      have      been      kidnapped         by     <PREP_IND>            di : distance from root node                     (0+1)/(2+1) = 0.33                            (1+1)/(2+1) =0.67
                                                                                                          c : for smoothing (default:1)
                                                                                                                                                         Experiment
 Pattern Matching
  Sequence Alignment                                                                                Experimental Setup                            Experimental Result
  Based on a Dynamic Programming                                                                     Data                                          Pattern Models
  Alignment Matrix                                                                                     MUC-3/4 Data                                  SVO Model (Yangarber ‘00)
               peasants have been kidnapped by terrorists                                                 About the Terrorism Events                  Linked-Chain Model (Greenwood ‘06)
   <HUM_TGT>      1       0   0        0     0     0
         of       0       1   0        0     0     0
                                                                                                        Simpler template structure with 4 slots       Subtree Model (Sudo ‘03)
        [NP]      0     0.66 1         0     0     0                                                      perp_ind, perp_org, phys_tgt, hum_tgt       Our Model
        have      0       4   3        2     1     0
        been      0       0   8        7     6     5                                                    Dev-set (training), Test-set (evaluation)   Result
     kidnapped    0       0   4       12     11    10                                                 Preprocessing                                      Model      Precision Recall F-measure
         by       0       0   0        8    13.5  12.5
   <PRED_IND>    1.5     0.5  0        0     0     15                                                   Dependency Parsing and NP-chunking                SVO        21.74    20.62    21.16

                                                                                                          Stanford Parser                             Linked-Chain   20.04    26.55    22.84
   Matrix Computation                                                                                Extracting Pattern Candidates                      Subtree     23.34    32.73    27.25
                                                                                                                                                        Alignment     23.35    45.62    30.89
                                       Mi-1,j-1 + sim i-1,j-1 * wi-1                                    Selecting all pattern candidates for test
                                       Mi-1,j + gp * wi-1                                               Without pattern filtering                     Our proposed model achieved much
          Mi,j = max
                                       Mi,j-1 + gp * wi                                                 To compare not the pattern filtering         higher recall than the other models with
                                       0                                                                  method, but the representative performance                      similar precision
                                                                                                          among pattern models

More Related Content

What's hot

Detecting aspect-specific code smells using Ekeko for AspectJ
Detecting aspect-specific code smells using Ekeko for AspectJDetecting aspect-specific code smells using Ekeko for AspectJ
Detecting aspect-specific code smells using Ekeko for AspectJ
Coen De Roover
 
Dart function - Recursive functions
Dart function - Recursive functionsDart function - Recursive functions
Dart function - Recursive functions
KoAungThuOo1
 
Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Coen De Roover
 
JSTLQuick Reference
JSTLQuick ReferenceJSTLQuick Reference
JSTLQuick Reference
Hicham QAISSI
 
The SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
The SOUL Tool Suite for Querying Programs in Symbiosis with EclipseThe SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
The SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
Coen De Roover
 
django095-cheat-sheet
django095-cheat-sheetdjango095-cheat-sheet
django095-cheat-sheetwebuploader
 
Scientific computing
Scientific computingScientific computing
Scientific computingBrijesh Kumar
 

What's hot (9)

Detecting aspect-specific code smells using Ekeko for AspectJ
Detecting aspect-specific code smells using Ekeko for AspectJDetecting aspect-specific code smells using Ekeko for AspectJ
Detecting aspect-specific code smells using Ekeko for AspectJ
 
Dart function - Recursive functions
Dart function - Recursive functionsDart function - Recursive functions
Dart function - Recursive functions
 
Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012Ekeko Technology Showdown at SoTeSoLa 2012
Ekeko Technology Showdown at SoTeSoLa 2012
 
JSTLQuick Reference
JSTLQuick ReferenceJSTLQuick Reference
JSTLQuick Reference
 
The SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
The SOUL Tool Suite for Querying Programs in Symbiosis with EclipseThe SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
The SOUL Tool Suite for Querying Programs in Symbiosis with Eclipse
 
Discovery Of Functional Protein Linear Motifs Using a Greaddy Algorithm and I...
Discovery Of Functional Protein Linear Motifs Using a Greaddy Algorithm and I...Discovery Of Functional Protein Linear Motifs Using a Greaddy Algorithm and I...
Discovery Of Functional Protein Linear Motifs Using a Greaddy Algorithm and I...
 
django095-cheat-sheet
django095-cheat-sheetdjango095-cheat-sheet
django095-cheat-sheet
 
Chtp405
Chtp405Chtp405
Chtp405
 
Scientific computing
Scientific computingScientific computing
Scientific computing
 

Viewers also liked

Presenter and Decorator in Rails
Presenter and Decorator in RailsPresenter and Decorator in Rails
Presenter and Decorator in Rails
Thaichor Seng
 
Functional reactive programming
Functional reactive programmingFunctional reactive programming
Functional reactive programming
Araf Karsh Hamid
 
Phani Kumar - Decorator Pattern
Phani Kumar - Decorator PatternPhani Kumar - Decorator Pattern
Phani Kumar - Decorator Pattern
melbournepatterns
 
Builder Design Pattern (Generic Construction -Different Representation)
Builder Design Pattern (Generic Construction -Different Representation)Builder Design Pattern (Generic Construction -Different Representation)
Builder Design Pattern (Generic Construction -Different Representation)
Sameer Rathoud
 
How I Learned To Apply Design Patterns
How I Learned To Apply Design PatternsHow I Learned To Apply Design Patterns
How I Learned To Apply Design Patterns
Andy Maleh
 
Pattern Languages — An Approach to Holistic Knowledge Representation
Pattern Languages — An Approach to Holistic Knowledge RepresentationPattern Languages — An Approach to Holistic Knowledge Representation
Pattern Languages — An Approach to Holistic Knowledge Representation
Douglas Schuler
 
Observer pattern
Observer patternObserver pattern
Observer pattern
Samreen Farooq
 
Software Design Patterns - Selecting the right design pattern
Software Design Patterns - Selecting the right design patternSoftware Design Patterns - Selecting the right design pattern
Software Design Patterns - Selecting the right design pattern
Joao Pereira
 
Decorator Design Pattern
Decorator Design PatternDecorator Design Pattern
Decorator Design Pattern
Adeel Riaz
 
Design pattern
Design patternDesign pattern
Design pattern
Thibaut De Broca
 

Viewers also liked (10)

Presenter and Decorator in Rails
Presenter and Decorator in RailsPresenter and Decorator in Rails
Presenter and Decorator in Rails
 
Functional reactive programming
Functional reactive programmingFunctional reactive programming
Functional reactive programming
 
Phani Kumar - Decorator Pattern
Phani Kumar - Decorator PatternPhani Kumar - Decorator Pattern
Phani Kumar - Decorator Pattern
 
Builder Design Pattern (Generic Construction -Different Representation)
Builder Design Pattern (Generic Construction -Different Representation)Builder Design Pattern (Generic Construction -Different Representation)
Builder Design Pattern (Generic Construction -Different Representation)
 
How I Learned To Apply Design Patterns
How I Learned To Apply Design PatternsHow I Learned To Apply Design Patterns
How I Learned To Apply Design Patterns
 
Pattern Languages — An Approach to Holistic Knowledge Representation
Pattern Languages — An Approach to Holistic Knowledge RepresentationPattern Languages — An Approach to Holistic Knowledge Representation
Pattern Languages — An Approach to Holistic Knowledge Representation
 
Observer pattern
Observer patternObserver pattern
Observer pattern
 
Software Design Patterns - Selecting the right design pattern
Software Design Patterns - Selecting the right design patternSoftware Design Patterns - Selecting the right design pattern
Software Design Patterns - Selecting the right design pattern
 
Decorator Design Pattern
Decorator Design PatternDecorator Design Pattern
Decorator Design Pattern
 
Design pattern
Design patternDesign pattern
Design pattern
 

More from Seokhwan Kim

The Eighth Dialog System Technology Challenge (DSTC8)
The Eighth Dialog System Technology Challenge (DSTC8)The Eighth Dialog System Technology Challenge (DSTC8)
The Eighth Dialog System Technology Challenge (DSTC8)
Seokhwan Kim
 
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Seokhwan Kim
 
Dynamic Memory Networks for Dialogue Topic Tracking
Dynamic Memory Networks for Dialogue Topic TrackingDynamic Memory Networks for Dialogue Topic Tracking
Dynamic Memory Networks for Dialogue Topic Tracking
Seokhwan Kim
 
The Fifth Dialog State Tracking Challenge (DSTC5)
The Fifth Dialog State Tracking Challenge (DSTC5)The Fifth Dialog State Tracking Challenge (DSTC5)
The Fifth Dialog State Tracking Challenge (DSTC5)
Seokhwan Kim
 
Natural Language in Human-Robot Interaction
Natural Language in Human-Robot InteractionNatural Language in Human-Robot Interaction
Natural Language in Human-Robot Interaction
Seokhwan Kim
 
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Seokhwan Kim
 
The Fourth Dialog State Tracking Challenge (DSTC4)
The Fourth Dialog State Tracking Challenge (DSTC4)The Fourth Dialog State Tracking Challenge (DSTC4)
The Fourth Dialog State Tracking Challenge (DSTC4)
Seokhwan Kim
 
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Seokhwan Kim
 
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Seokhwan Kim
 
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
Seokhwan Kim
 
Sequential Labeling for Tracking Dynamic Dialog States
Sequential Labeling for Tracking Dynamic Dialog StatesSequential Labeling for Tracking Dynamic Dialog States
Sequential Labeling for Tracking Dynamic Dialog States
Seokhwan Kim
 
Wikipedia-based Kernels for Dialogue Topic Tracking
Wikipedia-based Kernels for Dialogue Topic TrackingWikipedia-based Kernels for Dialogue Topic Tracking
Wikipedia-based Kernels for Dialogue Topic TrackingSeokhwan Kim
 
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...Seokhwan Kim
 
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...Seokhwan Kim
 
MMR-based active machine learning for Bio named entity recognition
MMR-based active machine learning for Bio named entity recognitionMMR-based active machine learning for Bio named entity recognition
MMR-based active machine learning for Bio named entity recognitionSeokhwan Kim
 
A semi-supervised method for efficient construction of statistical spoken lan...
A semi-supervised method for efficient construction of statistical spoken lan...A semi-supervised method for efficient construction of statistical spoken lan...
A semi-supervised method for efficient construction of statistical spoken lan...Seokhwan Kim
 
A spoken dialog system for electronic program guide information access
A spoken dialog system for electronic program guide information accessA spoken dialog system for electronic program guide information access
A spoken dialog system for electronic program guide information accessSeokhwan Kim
 
An alignment-based approach to semi-supervised relation extraction including ...
An alignment-based approach to semi-supervised relation extraction including ...An alignment-based approach to semi-supervised relation extraction including ...
An alignment-based approach to semi-supervised relation extraction including ...Seokhwan Kim
 
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템Seokhwan Kim
 
A Cross-Lingual Annotation Projection Approach for Relation Detection
A Cross-Lingual Annotation Projection Approach for Relation DetectionA Cross-Lingual Annotation Projection Approach for Relation Detection
A Cross-Lingual Annotation Projection Approach for Relation DetectionSeokhwan Kim
 

More from Seokhwan Kim (20)

The Eighth Dialog System Technology Challenge (DSTC8)
The Eighth Dialog System Technology Challenge (DSTC8)The Eighth Dialog System Technology Challenge (DSTC8)
The Eighth Dialog System Technology Challenge (DSTC8)
 
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
 
Dynamic Memory Networks for Dialogue Topic Tracking
Dynamic Memory Networks for Dialogue Topic TrackingDynamic Memory Networks for Dialogue Topic Tracking
Dynamic Memory Networks for Dialogue Topic Tracking
 
The Fifth Dialog State Tracking Challenge (DSTC5)
The Fifth Dialog State Tracking Challenge (DSTC5)The Fifth Dialog State Tracking Challenge (DSTC5)
The Fifth Dialog State Tracking Challenge (DSTC5)
 
Natural Language in Human-Robot Interaction
Natural Language in Human-Robot InteractionNatural Language in Human-Robot Interaction
Natural Language in Human-Robot Interaction
 
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
 
The Fourth Dialog State Tracking Challenge (DSTC4)
The Fourth Dialog State Tracking Challenge (DSTC4)The Fourth Dialog State Tracking Challenge (DSTC4)
The Fourth Dialog State Tracking Challenge (DSTC4)
 
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
 
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
 
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
 
Sequential Labeling for Tracking Dynamic Dialog States
Sequential Labeling for Tracking Dynamic Dialog StatesSequential Labeling for Tracking Dynamic Dialog States
Sequential Labeling for Tracking Dynamic Dialog States
 
Wikipedia-based Kernels for Dialogue Topic Tracking
Wikipedia-based Kernels for Dialogue Topic TrackingWikipedia-based Kernels for Dialogue Topic Tracking
Wikipedia-based Kernels for Dialogue Topic Tracking
 
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
 
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
 
MMR-based active machine learning for Bio named entity recognition
MMR-based active machine learning for Bio named entity recognitionMMR-based active machine learning for Bio named entity recognition
MMR-based active machine learning for Bio named entity recognition
 
A semi-supervised method for efficient construction of statistical spoken lan...
A semi-supervised method for efficient construction of statistical spoken lan...A semi-supervised method for efficient construction of statistical spoken lan...
A semi-supervised method for efficient construction of statistical spoken lan...
 
A spoken dialog system for electronic program guide information access
A spoken dialog system for electronic program guide information accessA spoken dialog system for electronic program guide information access
A spoken dialog system for electronic program guide information access
 
An alignment-based approach to semi-supervised relation extraction including ...
An alignment-based approach to semi-supervised relation extraction including ...An alignment-based approach to semi-supervised relation extraction including ...
An alignment-based approach to semi-supervised relation extraction including ...
 
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
 
A Cross-Lingual Annotation Projection Approach for Relation Detection
A Cross-Lingual Annotation Projection Approach for Relation DetectionA Cross-Lingual Annotation Projection Approach for Relation Detection
A Cross-Lingual Annotation Projection Approach for Relation Detection
 

Recently uploaded

Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
Alpen-Adria-Universität
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 

Recently uploaded (20)

Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Video Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the FutureVideo Streaming: Then, Now, and in the Future
Video Streaming: Then, Now, and in the Future
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 

An Alignment-based Pattern Representation Model for Information Extraction

  • 1. An Alignment-based Pattern Representation Model for Information Extraction Seokhwan Kim, Minwoo Jeong, Gary Geunbae Lee {megaup, stardust, gblee}@postech.ac.kr Abstract - In this paper, we propose an alternative pattern representation model and the effective method of utilizing it. While the previous pattern representation models completely depend on the result of dependency analysis, our approach is basically based on the lexical alignment and considers the result of dependency analysis only as a meaningful feature of the alignment process. In this way, we can cope with the errors of incomplete dependency analysis. An evaluation of a scenario template task shows that our proposed model outperforms the previous syntax-dependent models. Pattern Representation Model for Information Extraction  Information Extraction  Pattern Representation Model for IE  Related Works  Extracting the defined number of relevant  Problem Definition  Lexical Sequence Pattern Models arguments from natural language documents  Ex)  A set of lexical sequences  Subtasks <hum_tgt> be kidnapped About 50 peasants have been kidnapped by terrorists of the FMNL be kidnapped by <prep_ind> # of arguments subtask 1 named-entity recognition Extraction Pattern  Syntax-dependent Pattern Models 2 binary relation extraction ?  A set of subtrees (from D-tree) more than 2 relation/event extraction incident type kidnapping kidnapped  Approach prep_ind terrorists nsubjpass agent  Automatic Pattern Learning prep_org FMNL peasants terrorists hum_tgt peasants  Pattern Representation Model prep_of  Pattern Learning Algorithm (kidnapped ({HUM_TGT}-nsubjpass) FMNL (kidnapped ({PREP_IND}-agent)) (kidnapped ({PREP_IND}-agent ({PREP_ORG}-prep_of))) Method  Our Approach  Pattern Sequences Extraction  Pattern Model 1) Searching the sentences containing all  Ex) (3+1)/(0+1) = 4  Lexical Sequence Pattern arguments of each tuple in source documents 2) Segmenting out subpart of the sentence kidnapped  + Term Weight (from Dependency Analysis) based on clausal boundaries nsubjpass agent <HUM_TGT> of [NP] have been kidnapped by <PRED_IND> 3) Replacing the parts of arguments in the (1+1)/(1+1) = 1 (2+1)/(1+1) = 1.5 1 0.33 0.33 4 4 4 1.5 1.5 sub-sentence with argument labels  Computing Term Weights <HUM_TGT> <PREP_IND>  Soft Pattern Matching  Sequence Alignment wi = (ri + c) / (di + c) prep_of prep_of about 50 peasants have been kidnapped by terrorists wi : weight of i-th term ri : number of relevant terms within [NP] <PREP_ORG> a subtree, ti as root <HUM_TGT> of [NP] have been kidnapped by <PREP_IND> di : distance from root node (0+1)/(2+1) = 0.33 (1+1)/(2+1) =0.67 c : for smoothing (default:1) Experiment  Pattern Matching  Sequence Alignment  Experimental Setup  Experimental Result  Based on a Dynamic Programming  Data  Pattern Models  Alignment Matrix  MUC-3/4 Data  SVO Model (Yangarber ‘00) peasants have been kidnapped by terrorists  About the Terrorism Events  Linked-Chain Model (Greenwood ‘06) <HUM_TGT> 1 0 0 0 0 0 of 0 1 0 0 0 0  Simpler template structure with 4 slots  Subtree Model (Sudo ‘03) [NP] 0 0.66 1 0 0 0  perp_ind, perp_org, phys_tgt, hum_tgt  Our Model have 0 4 3 2 1 0 been 0 0 8 7 6 5  Dev-set (training), Test-set (evaluation)  Result kidnapped 0 0 4 12 11 10  Preprocessing Model Precision Recall F-measure by 0 0 0 8 13.5 12.5 <PRED_IND> 1.5 0.5 0 0 0 15  Dependency Parsing and NP-chunking SVO 21.74 20.62 21.16  Stanford Parser Linked-Chain 20.04 26.55 22.84  Matrix Computation  Extracting Pattern Candidates Subtree 23.34 32.73 27.25 Alignment 23.35 45.62 30.89 Mi-1,j-1 + sim i-1,j-1 * wi-1  Selecting all pattern candidates for test Mi-1,j + gp * wi-1  Without pattern filtering  Our proposed model achieved much Mi,j = max Mi,j-1 + gp * wi  To compare not the pattern filtering higher recall than the other models with 0 method, but the representative performance similar precision among pattern models