TIMEN
An Open Temporal Expression
   Normalisation Resource




H.Llorens, L.Derczynski, R.Gaizauskas, E. Saquete
Outline
●   Introduction: Timex normalisation
●   Related work
●   Problem: reinventing the wheel over and over

●   Proposal: TIMEN
●   Evaluation
●   Conclusions
●   Further Work
Timex Normalisation
Temporal information extraction subtask.

Timex: linguistic expression of a time point or interval.

Normalisation: semantic interpretation of timexes.
Temporal Expression (TIMEX)          Timex normalization
Linguistics/Variability/Relativity   ISO 8601/Invariable interpretation
June 2012, next month, 06/2012       2012-06
this morning 7 a.m.                  2012-05-24T07:00
3 days and 3 hours                   P3DT3H
weekly                               XXXX-XX-WXX
Timex Normalisation (II)
Useful for a variety of NLP applications: IR, QA,
Summarization, etc.

           I went to the cinema yesterday.
             event                    timex
                                 Value: 2012-05-23

     When did he go to the cinema? 2012-05-23

The main advantage of normalisation is having timexes in
standard time representations (e.g., Gregorian calendar).
Related Work
There are many approaches to timex normalisation

● Pre TempEval-2
  ○ TempEx (2000), GUTime (2005), Chronos (2004),
     TERSEO (2005), TimexTag (2005), TEA (2006),
     DANTE (2007)...
● TempEval-2 (2010)
  ○ HeidelTime, TRIPS/TRIOS, TIPSem/TIPSemB...
Similarities and differences
● Approaches have slightly different architectures and
   show slightly different performances on tests.

● But all the approaches are rule-based and in general
   they use the same normalization strategies.

● They also require the same parameters to perform the task:
   ○   DCT: document creation time (deictic)
       (2 days ago: 2012-05-22)
   ○   Reference time: time talked about (anaphoric)
       (2 days before: 2012-05-20)
   ○   Tense: resolution direction (October)
       Past (2011-10), Present/Future (2012-10)
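
A minimal Python sketch of how the three parameters could drive resolution. It is an illustration, not TIMEN's code: the function names are hypothetical, the reference time of 2012-05-22 is assumed (the slide gives only the result), and a month equal to the DCT month is assumed to resolve to the current year.

```python
from datetime import date, timedelta


def days_before(anchor: date, n: int) -> str:
    """Return the ISO 8601 date that lies n days before the anchor."""
    return (anchor - timedelta(days=n)).isoformat()


def resolve_month(month: int, dct: date, tense: str) -> str:
    """Resolve a bare month name to YYYY-MM, using tense as the direction."""
    year = dct.year
    if tense == "past" and month > dct.month:
        year -= 1  # latest such month before the DCT
    elif tense != "past" and month < dct.month:
        year += 1  # next such month after the DCT
    return f"{year:04d}-{month:02d}"


dct = date(2012, 5, 24)
reference_time = date(2012, 5, 22)  # assumed for illustration

print(days_before(dct, 2))               # "2 days ago" (deictic): 2012-05-22
print(days_before(reference_time, 2))    # "2 days before" (anaphoric): 2012-05-20
print(resolve_month(10, dct, "past"))    # "October", past tense: 2011-10
print(resolve_month(10, dct, "future"))  # "October", present/future: 2012-10
```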
The problem
Reinventing the wheel over and over
● Implementing a high-performance approach is costly,
  and each time it is done from scratch.
● All the approaches are similar: rule-based, with similar
  normalisation rules and strategies.
● None is meant to be reused and refined by others.
Proposal: TIMEN
Characteristics:
 ● Open philosophy: meant to be reused and refined (even
   across languages)

 ●   Not only meant for computer scientists:
      ○   the algorithms (source code) and normalisation rules (db of user-
          friendly rules with a documented syntax) are separated.

 ●   Independent from other timex processing tasks

 ●   Multi-platform and easy integration
TIMEN Library Architecture
Example 1:
timex: three days ago
DCT: 2012-05-24
normtext: 3_day_ago
pattern: Num_TUnit_ago
Only 1 rule matches.
normalised value: 2012-05-21

Example 2:
timex: October 20
2 rules match, so disambiguation is needed:
20 is probably a day rather than a year, because 20 < 32.
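
The two examples above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not TIMEN's implementation or rule syntax: the lexicons, the rule functions, and the `Month_Num` pattern name are hypothetical, and a month-day timex is assumed to take the year of the DCT.

```python
from datetime import date, timedelta

# Tiny hypothetical lexicons, truncated for the sketch.
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3"}
MONTHS = {"october": 10}
UNIT_DAYS = {"day": 1, "week": 7}


def to_normtext(timex):
    """Lowercase, map number words to digits, singularise time units."""
    tokens = []
    for tok in timex.lower().split():
        tok = NUMBER_WORDS.get(tok, tok)
        if tok.endswith("s") and tok[:-1] in UNIT_DAYS:
            tok = tok[:-1]
        tokens.append(tok)
    return "_".join(tokens)


def to_pattern(normtext):
    """Replace each token by its class name where one is known."""
    parts = []
    for tok in normtext.split("_"):
        if tok.isdigit():
            parts.append("Num")
        elif tok in UNIT_DAYS:
            parts.append("TUnit")
        elif tok in MONTHS:
            parts.append("Month")
        else:
            parts.append(tok)
    return "_".join(parts)


def rule_num_tunit_ago(tokens, dct):
    return (dct - timedelta(days=int(tokens[0]) * UNIT_DAYS[tokens[1]])).isoformat()


def rule_month_day(tokens, dct):
    # Assumption: the year is taken from the DCT.
    return f"{dct.year:04d}-{MONTHS[tokens[0]]:02d}-{int(tokens[1]):02d}"


def rule_month_year(tokens, dct):
    return f"{int(tokens[1]):04d}-{MONTHS[tokens[0]]:02d}"


RULES = {
    "Num_TUnit_ago": [rule_num_tunit_ago],
    "Month_Num": [rule_month_day, rule_month_year],
}


def normalise(timex, dct):
    normtext = to_normtext(timex)
    tokens = normtext.split("_")
    candidates = RULES.get(to_pattern(normtext), [])
    if not candidates:
        return None  # no rule matches
    if len(candidates) == 1:
        return candidates[0](tokens, dct)
    # Disambiguation (only Month_Num is ambiguous in this sketch):
    # a number below 32 is read as a day, otherwise as a year.
    rule = rule_month_day if int(tokens[1]) < 32 else rule_month_year
    return rule(tokens, dct)


dct = date(2012, 5, 24)
print(to_normtext("three days ago"), to_pattern("3_day_ago"))
print(normalise("three days ago", dct))  # 2012-05-21
print(normalise("October 20", dct))
```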
Rule base sample (English)
TIMEN integration
TIMEN community
● Open-source software:
    http://code.google.com/p/timen/



● Crowd extension of the rule set (interactive
  web interface to upload and check new
  rules): http://timen.org

* New rules are only accepted if they improve performance on the current
dataset or on new examples (human reviewed). Example: New Year's Eve.
Evaluation
Experiments:
● Normalisation accuracy of TIMEN
● Performance gain in state-of-the-art approaches from
  integrating TIMEN
Datasets:
● TempEval-2 test set
  (already known to the approaches; mainly common dates and durations)
● TimenEval dataset
  (new, unknown to the approaches; balanced among different timex types)
Normalisation accuracy

        gold timexes      TIMEN normalisation    judgement
        yesterday         2012-05-23             correct
        2012              2012                   correct
        October           2012-10                incorrect
        daily             xxxx-xx-xx             correct
        morning           2011                   incorrect
        ...               ...                    ...

e.g. TOTAL: 100 timexes to normalise, 90 correct normalisations

         RESULT: 90/100 --> 90% ACCURACY
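
The accuracy measure amounts to exact-match counting over the gold timexes. A small sketch follows; the gold values are hypothetical placeholders, since the slide shows only the system output and the correct/incorrect judgement.

```python
def accuracy(gold, predicted):
    """Fraction of timexes whose predicted value equals the gold value."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equally long")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical gold values for: yesterday, 2012, October, daily, morning.
gold = ["2012-05-23", "2012", "2011-10", "xxxx-xx-xx", "2012-05-24TMO"]
predicted = ["2012-05-23", "2012", "2012-10", "xxxx-xx-xx", "2011"]

print(accuracy(gold, predicted))  # 3 of 5 correct
```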
Normalisation accuracy
         TEST SET          NORMALISATION ACC.
         TempEval-2               0.90
         TimenEval                0.68


● TIMEN performs well even in this first
  version (only 76 rules).

● TimenEval accuracy is lower: this corpus is more
  heterogeneous (times/sets), so normalisation is more
  difficult.
Performance gain
[Diagram: the timexes recognized by Approach X are normalised twice,
once by the built-in normalisation of Approach X (original normalisation)
and once by TIMEN (new normalisation).]

Performance gain = New accuracy - Original accuracy
Performance gain
(TempEval-2) "known data"
   System       built-in norm.   TIMEN norm.   Err. Reduction
   TIPSemB           0.83            0.89           35%
   HeidelTime        0.94            0.94           0%
   TERNIP            0.76            0.92           66%
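
The error-reduction column above appears consistent with relative error reduction, that is, the share of the original error that disappears when TIMEN is used. The slides do not state the formula, so the sketch below is an assumption. The accuracies in the table are rounded, so recomputed percentages can differ by a point.

```python
def error_reduction(original_acc, new_acc):
    """Relative reduction of the error rate (1 - accuracy)."""
    original_err = 1.0 - original_acc
    new_err = 1.0 - new_acc
    if original_err == 0:
        raise ValueError("original accuracy is already 1.0")
    return (original_err - new_err) / original_err


results = {
    "TIPSemB": (0.83, 0.89),
    "HeidelTime": (0.94, 0.94),
    "TERNIP": (0.76, 0.92),
}
for system, (built_in, timen) in results.items():
    gain = timen - built_in  # performance gain as defined earlier
    print(f"{system}: gain {gain:+.2f}, "
          f"error reduction {error_reduction(built_in, timen):.1%}")
```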

● Replacing the systems' built-in normalisation with TIMEN
  generally improves their performance on the TE2 test set.
● The tested (current) versions of the systems may have
  been developed or updated with knowledge of this data. What
  happens with data that is new to them?
Performance gain
(TimenEval) "new data"
   System       built-in norm.   TIMEN norm.   Err. Reduction
   TIPSemB           0.57            0.67           23%
   HeidelTime        0.72            0.74           7%
   TERNIP            0.70            0.72           ≈7%



● On new data, the performance of the built-in approaches
  generally decreases.
● TIMEN improves normalisation performance for all the
  systems.
Conclusions
● We presented an open tool for timex normalisation:
  TIMEN.

● ADVANTAGES:
  ○ High performance (above recent approaches).
  ○ Easily integrated into any timex recognition
    approach.
  ○ Can be improved by the community (open philosophy),
    and avoids re-development from scratch.
  ○ Available: http://timen.org and Google Code
Further Work

● Community-based extension and refinement
  of TIMEN (rulebase).

● Extensive evaluation of TIMEN in various
  languages (Spanish, Chinese, Italian and Danish).
TIMEN: An Open TIMEX Normalisation Resource

              THANK YOU!
                   QUESTIONS?

                   http://timen.org

       H.Llorens, L.Derczynski, R.Gaizauskas, E. Saquete

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 

TIMEN: An Open Temporal Expression Normalisation Resource

  • 1. TIMEN An Open Temporal Expression Normalisation Resource H.Llorens, L.Derczynski, R.Gaizauskas, E. Saquete
  • 2. Outline ● Introduction: Timex normalisation ● Related work ● Problem: reinventing the wheel again and again ● Proposal: TIMEN ● Evaluation ● Conclusions ● Further Work
  • 3. Timex Normalisation
       Temporal information extraction subtask.
       Timex: linguistic expression of a time point or interval.
       Normalisation: semantic interpretation of timexes.
       Temporal Expression (TIMEX)          Timex normalisation
       Linguistics/Variability/Relativity   ISO 8601/Invariable interpretation
       June 2012, next month, 06/2012       2012-06
       this morning 7 a.m.                  2012-05-24T07:00
       3 days and 3 hours                   P3DT3H
       weekly                               XXXX-XX-WXX
  • 4. Timex Normalisation (II) Useful for a variety of NLP applications: IR, QA, Summarization, etc. Example: "I went (event) to the cinema yesterday (timex, value: 2012-05-23)." When did he go to the cinema? 2012-05-23. The main advantage of normalisation is having timexes in standard time representations (e.g., Gregorian calendar).
  • 5. Related Work There are many approaches to timex normalisation ● Pre TempEval-2 ○ TempEx (2000), GUTime (2005), Chronos (2004), TERSEO (2005), TimexTag (2005), TEA (2006), DANTE (2007)... ● TempEval-2 (2010) ○ HeidelTime, TRIPS/TRIOS, TIPSem/TIPSemB...
  • 6. Similarities and differences
       ● Approaches have slightly different architectures and show slightly different performance on tests.
       ● But all the approaches are rule-based and, in general, use the same normalisation strategies.
       ● They also require the same parameters to perform the task:
         ○ DCT: document creation time (deictic) (2 days ago: 2012-05-22)
         ○ Reference time: time talked about (anaphoric) (2 days before: 2012-05-20)
         ○ Tense: resolution direction (October): Past (2011-10), Present/Future (2012-10)
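The three parameters on slide 6 can be illustrated with a minimal Python sketch. It is not TIMEN's code: the function names `days_before` and `resolve_month`, the `tense` strings, and the choice to keep the current year when the month equals the DCT month are all assumptions made for illustration. The slide's examples imply a DCT of 2012-05-24 and a reference time of 2012-05-22.

```python
from datetime import date, timedelta


def days_before(anchor: date, n: int) -> str:
    """Shift an anchor date back by n days and return an ISO 8601 date.

    The anchor is the DCT for deictic timexes ("2 days ago") and the
    reference time for anaphoric ones ("2 days before").
    """
    return (anchor - timedelta(days=n)).isoformat()


def resolve_month(month: int, dct: date, tense: str) -> str:
    """Resolve a bare month name such as "October" to YYYY-MM.

    Hypothetical rule: a past tense looks backwards from the DCT,
    any other tense looks forwards.
    """
    year = dct.year
    if tense == "past":
        if month > dct.month:      # month has not occurred yet this year
            year -= 1
    elif month < dct.month:        # month is already over this year
        year += 1
    return f"{year:04d}-{month:02d}"


dct = date(2012, 5, 24)
reference_time = date(2012, 5, 22)

print(days_before(dct, 2))               # deictic: 2012-05-22
print(days_before(reference_time, 2))    # anaphoric: 2012-05-20
print(resolve_month(10, dct, "past"))    # 2011-10
print(resolve_month(10, dct, "future"))  # 2012-10
```

The same arithmetic serves both the deictic and the anaphoric case; only the anchor date changes, which is why every approach needs both the DCT and a reference time.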
  • 7. The problem: reinventing the wheel again and again ● Implementing high-performance approaches is costly, yet it is done from scratch every time. ● All the approaches are similar: rule-based with similar normalisation rules and strategies. ● None is meant to be reused and refined by others.
  • 8. Proposal: TIMEN Characteristics: ● Open philosophy: meant to be reused and refined (even across languages) ● Not only meant for computer scientists: ○ the algorithms (source code) and the normalisation rules (a database of user-friendly rules with a documented syntax) are kept separate. ● Independent from other timex processing tasks ● Multi-platform and easy to integrate
  • 9. TIMEN Library Architecture
       Example 1: timex: three days ago, DCT: 2012-05-24
         normtext: 3_day_ago
         pattern: Num_TUnit_ago
         Only 1 rule matches.
         normalized value: 2012-05-21
       Example 2: timex: October 20
         2 rules match, so disambiguation is needed: 20 is probably a day rather than a year because it is < 32.
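The pipeline on slide 9 (timex, then normtext, then pattern, then rule) can be sketched in Python as follows. This is an illustrative toy, not TIMEN's implementation or rule syntax: the dictionaries, the function names, the `Month` class label, and the output format chosen for "October 20" (day of the DCT year) are assumptions. Only the `3_day_ago` / `Num_TUnit_ago` example and the "< 32" heuristic come from the slide.

```python
from datetime import date, timedelta

# Hypothetical, deliberately tiny lexicons.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
UNIT_DAYS = {"day": 1, "week": 7}
MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}


def to_normtext(timex: str) -> str:
    """Lower-case, turn number words into digits, singularise time units."""
    tokens = []
    for tok in timex.lower().split():
        if tok in NUMBER_WORDS:
            tok = str(NUMBER_WORDS[tok])
        elif tok.endswith("s") and tok[:-1] in UNIT_DAYS:
            tok = tok[:-1]
        tokens.append(tok)
    return "_".join(tokens)


def to_pattern(normtext: str) -> str:
    """Replace each token by its class; unknown tokens are kept verbatim."""
    classes = []
    for tok in normtext.split("_"):
        if tok.isdigit():
            classes.append("Num")
        elif tok in UNIT_DAYS:
            classes.append("TUnit")
        elif tok in MONTHS:
            classes.append("Month")
        else:
            classes.append(tok)
    return "_".join(classes)


def normalise(timex: str, dct: date) -> str:
    """Apply the rule selected by the pattern and return an ISO 8601 value."""
    normtext = to_normtext(timex)
    tokens = normtext.split("_")
    pattern = to_pattern(normtext)
    if pattern == "Num_TUnit_ago":
        delta = timedelta(days=int(tokens[0]) * UNIT_DAYS[tokens[1]])
        return (dct - delta).isoformat()
    if pattern == "Month_Num":
        month, num = MONTHS[tokens[0]], int(tokens[1])
        if num < 32:                      # probably a day of the month
            return f"{dct.year:04d}-{month:02d}-{num:02d}"
        return f"{num:04d}-{month:02d}"   # otherwise read it as a year
    raise ValueError(f"no rule for pattern {pattern!r}")


dct = date(2012, 5, 24)
print(to_normtext("three days ago"))      # 3_day_ago
print(to_pattern("3_day_ago"))            # Num_TUnit_ago
print(normalise("three days ago", dct))   # 2012-05-21
print(normalise("October 20", dct))       # 2012-10-20
```

Keeping the lexicons and the rule conditions apart from the matching code mirrors the separation of algorithms and rule base that the proposal describes, although the real rule base is stored in its own documented syntax rather than in Python.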
  • 10. Rule base sample (English)
  • 12. TIMEN community ● Open-source software: http://code.google.com/p/timen/ ● Crowd extension of the rule set (interactive web interface to upload and check new rules): http://timen.org * New rules are only accepted if they improve performance on the current dataset or on new, human-reviewed examples. E.g.: New Year's Eve
  • 13. Evaluation Experiments: ● Normalisation accuracy of TIMEN ● Performance gain in state-of-the-art approaches by integrating TIMEN Datasets: ● TempEval-2 test set (already known to the approaches, mainly common dates and durations) ● TimenEval dataset (new, unknown to the approaches, balanced among different timex types)
  • 14. Normalisation accuracy
        gold timexes   TIMEN normalisation   judgement
        yesterday      2012-05-23            correct
        2012           2012                  correct
        October        2012-10               incorrect
        daily          xxxx-xx-xx            correct
        morning        2011                  incorrect
        ...            ...                   ...
        e.g. TOTAL: 100 timexes to normalise
        e.g. TOTAL: 90 correct normalisations
        RESULT: 90/100 --> 90% ACCURACY
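The accuracy measure on slide 14 is the fraction of gold timexes whose normalised value matches the gold value. A minimal Python sketch follows; the function name and the exact-string comparison are assumptions, since the slides do not say how values are compared.

```python
def normalisation_accuracy(gold_values, predicted_values):
    """Return the share of timexes whose predicted value equals the gold value."""
    if len(gold_values) != len(predicted_values):
        raise ValueError("gold and predicted lists must be aligned")
    if not gold_values:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(g == p for g, p in zip(gold_values, predicted_values))
    return correct / len(gold_values)


# Toy data reproducing the slide's arithmetic: 90 correct out of 100.
gold = ["2012-05-23"] * 100
predicted = ["2012-05-23"] * 90 + ["2011"] * 10
print(normalisation_accuracy(gold, predicted))  # 0.9
```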
  • 15. Normalisation accuracy
        TEST SET     NORMALISATION ACC
        TempEval-2   0.90
        TimenEval    0.68
        ● TIMEN shows high performance even in this first version (only 76 rules).
        ● TimenEval accuracy is lower. This corpus is more heterogeneous (times/sets) and normalisation is more difficult.
  • 16. Performance gain
        Timexes recognized by Approach X → built-in normalisation of Approach X → Original normalisation
        Timexes recognized by Approach X → TIMEN normalisation → New normalisation
        Performance gain = New accuracy - Original accuracy
  • 17. Performance gain (TempEval-2) "known data"
        System       built-in norm.   TIMEN norm.   Err. Reduction
        TIPSemB      0.83             0.89          35%
        HeidelTime   0.94             0.94          0%
        TERNIP       0.76             0.92          66%
        ● Replacing the systems' built-in normalisation with TIMEN generally improves their performance on the TE2 test set.
        ● The tested (current) versions of the systems may have been developed or updated with knowledge of this data. What happens with data that is new to them?
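The slides define performance gain as new accuracy minus original accuracy but do not spell out how the "Err. Reduction" column is computed. The sketch below assumes the usual relative error reduction, (new - original) / (1 - original), which agrees with the TempEval-2 figures on slide 17 up to rounding. The function names are illustrative.

```python
def performance_gain(original_acc: float, new_acc: float) -> float:
    """Absolute gain, as defined on slide 16."""
    return new_acc - original_acc


def error_reduction(original_acc: float, new_acc: float) -> float:
    """Relative error reduction (assumed definition, not stated in the slides)."""
    original_error = 1.0 - original_acc
    if original_error == 0.0:
        raise ValueError("original accuracy is already perfect")
    return (new_acc - original_acc) / original_error


# TempEval-2 accuracies from slide 17: (built-in, TIMEN).
systems = {"TIPSemB": (0.83, 0.89), "HeidelTime": (0.94, 0.94), "TERNIP": (0.76, 0.92)}
for name, (built_in, timen) in systems.items():
    gain = performance_gain(built_in, timen)
    reduction = error_reduction(built_in, timen)
    print(f"{name}: gain {gain:.2f}, error reduction {reduction:.1%}")
```

Error reduction is the more informative of the two numbers when the baselines differ: a gain of 0.06 on a system that starts at 0.83 removes about a third of its remaining errors.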
  • 18. Performance gain (TimenEval) "new data"
        System       built-in norm.   TIMEN norm.   Err. Reduction
        TIPSemB      0.57             0.67          23%
        HeidelTime   0.72             0.74          7%
        TERNIP       0.70             0.72          66%
        ● On new data, the performance of the built-in approaches generally decreases.
        ● TIMEN improves normalisation performance for all the systems.
  • 19. Conclusions ● We presented an open tool for timex normalisation: TIMEN. ● ADVANTAGES: ○ High performance (above recent approaches). ○ Easily integrated into any timex recognition approach. ○ Can be improved by the community (open philosophy), and avoids re-development from scratch. ○ Available: http://timen.org and Google Code
  • 20. Further Work ● Community-based extension and refinement of TIMEN (rulebase). ● Extensive evaluation of TIMEN in various languages (Spanish, Chinese, Italian and Danish).
  • 21. TIMEN: An Open TIMEX Normalisation Resource THANK YOU! QUESTIONS? http://timen.org H.Llorens, L.Derczynski, R.Gaizauskas, E. Saquete