The document discusses Michele Filannino's final presentation on identifying temporal expressions in biomedical texts. It provides context on natural language processing and information extraction. It then defines temporal expressions and discusses their importance for tasks like question answering and summarization. The presentation outlines the forms temporal expressions can take, describes common annotation and normalization methods, and gives an example. It also notes the lack of freely available corpora and describes Filannino's contributions: building the first freely available timex corpus and a temporal expression normalizer. The presentation concludes by discussing some examples of human annotation mistakes and Filannino's remaining to-do items.
The human brain has evolved to master, among other things, the capacity to extract flows of events from speech or written text. This temporal sense, mainly unconscious, allows us to summarise, organise, remember and combine different pieces of information, working out new insights and discoveries. The temporal dimension is an inescapable and easy truth for us, but enabling machines to fully deal with time is a challenging task. Computers are still incapable of detecting temporal incompatibilities, summarising workflows or identifying causes and consequences of facts. My research aims to answer the following questions: Can computers understand time? And what possibilities will that unlock?
This internal presentation discusses Michele Filannino's research taster project on temporal expressions extraction. The project is part of Michele's four-year PhD through the CDT program, which includes a six-month foundation period with courses and a short taster project. Michele's taster project focuses on extracting temporal expressions from text, such as dates, times, durations, and frequencies, which can improve applications like question answering and summarization. The presentation covers challenges like the scarcity of annotated corpora, different annotation standards, and the vibrant research in extracting temporal expressions from clinical text.
The document summarizes the SWAP research group meeting on April 26, 2010. It outlines the SWOP semantic web service platform and the META multi-language text analyzer. SWOP allows annotating services with natural language descriptions and discovering them through UDDI. META performs analyses on multi-language texts and has a network interface, web interface, and web service interface implemented with Apache Tomcat and Axis2.
Using machine learning to predict temporal orientation of search engines’ que... (Michele Filannino)
The document describes a presentation on predicting the temporal orientation of search engine queries using machine learning. It discusses running queries through various models with different feature sets to classify the queries as having past, future, recency, or atemporal intent. The minimal model using fewer features achieved 61.33% accuracy on the test data, while an intermediate model had 66.33% accuracy and a full model using more features and random forests had 55% accuracy. Further analysis found room for improvement by optimizing the feature selection.
Temporal information extraction in the general and clinical domain (Michele Filannino)
This document summarizes a research symposium presentation on temporal information extraction. The presentation discusses extracting temporal information from text, including identifying temporal expressions like dates and durations, events, and links between them. It presents an example extraction and proposes a machine learning approach using conditional random fields. Evaluation results on benchmark tasks and potential applications in clinical narratives and predicting the temporal intent of queries are also mentioned.
Discovery of temporal information is key for organising knowledge and therefore the task of extracting and representing temporal information from texts has received increasing interest. In this paper we focus on the discovery of temporal footprints from encyclopaedic descriptions. Temporal footprints are time-line periods that are associated with the existence of specific concepts. Our approach relies on the extraction of date mentions and the prediction of lower and upper boundaries that define temporal footprints. We report on several experiments on persons’ pages from Wikipedia in order to illustrate the feasibility of the proposed methods.
Nonlinear component analysis as a kernel eigenvalue problem (Michele Filannino)
This presentation summarizes paper #7 titled "Nonlinear component analysis as a kernel eigenvalue problem" by Schölkopf, Smola, and Müller. It introduces Kernel Principal Component Analysis (KPCA) as an extension of PCA that maps data into a higher dimensional feature space. The presentation discusses how KPCA frames PCA as a kernel eigenvalue problem and computes principal components in this new feature space. It provides the mathematical formulation and algorithm for KPCA. The presentation also discusses applications, advantages, disadvantages, and experiments comparing KPCA to other dimensionality reduction techniques.
Algoritmo di text-similarity per l'annotazione semantica di Web Service (Michele Filannino)
The document discusses an algorithm for measuring text similarity called SAWA. It describes how SAWA calculates word-to-word and text-to-text similarity using Wikipedia as a concept hierarchy. Experimental results showed that optimizations improved performance by 10 times while maintaining quality results. Future work includes developing web service and web interfaces and releasing the source code as open-source.
The document discusses serendipity and its applications in computer science and information filtering. It proposes an architecture for a serendipity module that uses an inverted user profile to search for less similar recommendations and promote discovery. The module would select random but poorly similar items to support, not replace, typical recommendations. Upcoming developments include analogy-based recommendations and adaptive algorithms based on user tasks.
2. presentation temporal expressions
where we are
■ Computer science
● natural language processing
▶ information extraction
★ temporal expressions extraction
29/02/2012, Michele Filannino 2 / 23
3. presentation temporal expressions
temporal expression definition
■ a natural language phrase that denotes a temporal
entity: an interval or an instant (Ferro et al.)¹
● She has been at work for more than a month
● He wrapped up a three-hour meeting with the Iraqi
president in Baghdad today.
1 L. Ferro, I. Mani, B. Sundheim, and G. Wilson, “Tides temporal annotation
guidelines, v. 1.0.2,” MITRE, 2001
4. presentation temporal expressions
why?
■ user’s perspective
● temporal aspects of events and entities provide a
natural mechanism for organising information.
■ machine’s perspective
● improvements in
▶ question answering, summarisation, browsing
7. presentation temporal expressions
temporal forms¹
■ time or date references
● 11pm, February 14th, 2005
■ time references that anchor on another time
● one hour after midnight, two weeks before Christmas
■ durations
● few months, two days, five years
■ recurring times
● every third month, twice in the hour
1 J. Poveda, M. Surdeanu, and J. Turmo, “An analysis of Bootstrapping for the
Recognition of Temporal Expressions”, 2009
8. presentation temporal expressions
temporal forms¹
■ context-dependent times
● today, last year
■ vague references
● somewhere in the middle of June, the near future
■ times indicated by an event
● the day S. Berlusconi resigned
▶ an event is considered a cover term for situations that
happen or occur
1 J. Poveda, M. Surdeanu, and J. Turmo, “An analysis of Bootstrapping for the
Recognition of Temporal Expressions”, 2009
9. presentation temporal expressions
methodology
■ annotation
● recognition
▶ automatically detect and delimit expressions
▶ mostly machine-learning techniques
● normalisation
▶ assign attribute values to all the recognised
expressions
▶ using a shared and formal format
▶ mostly rule-based techniques
■ reasoning or searching
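The normalisation step above can be illustrated with a minimal rule-based sketch. It is not the TRIOS normaliser or the one built in this project: the lookup tables, the regular expression and the `normalise` function are assumptions made for illustration, and they cover only simple durations and "every <unit>" sets. The output values follow the ISO 8601-style duration strings used in TIMEX3 (for example `P1Q` for a quarter).

```python
import re

# Hypothetical lookup tables for this sketch; a real normaliser covers far more.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}
# Date units use the "P<n><unit>" form; time units need the "PT" prefix.
DATE_UNITS = {"day": "D", "week": "W", "month": "M", "quarter": "Q", "year": "Y"}
TIME_UNITS = {"second": "S", "minute": "M", "hour": "H"}

# "<quantity> <unit>" or "<quantity>-<unit>", with an optional plural "s".
PATTERN = re.compile(r"^(\d+|[a-z]+)[\s-]+([a-z]+?)s?$")


def normalise(expression):
    """Return TIMEX3-style attributes as a dict, or None if no rule applies."""
    match = PATTERN.match(expression.strip().lower())
    if match is None:
        return None
    quantity, unit = match.groups()

    if unit in DATE_UNITS:
        prefix, code = "P", DATE_UNITS[unit]
    elif unit in TIME_UNITS:
        prefix, code = "PT", TIME_UNITS[unit]
    else:
        return None

    # Recurring times ("every quarter") become SET expressions.
    if quantity == "every":
        return {"type": "SET", "value": f"{prefix}1{code}", "quant": "every"}

    if quantity.isdigit():
        amount = int(quantity)
    elif quantity in NUMBER_WORDS:
        amount = NUMBER_WORDS[quantity]
    else:
        return None  # vague quantities such as "few" are out of scope here
    return {"type": "DURATION", "value": f"{prefix}{amount}{code}"}


for phrase in ["every quarter", "two days", "three-hour", "few months"]:
    print(phrase, "->", normalise(phrase))
```

Vague or context-dependent expressions ("few months", "today") deliberately return `None` in this sketch, since resolving them needs a reference time or richer rules.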
10. presentation temporal expressions
example: raw text
That means Unisys must pay about $100 million in interest every
quarter, on top of $27 million in dividends on preferred stock.
Source: TRIOS TimeBank v.0.1
11. presentation temporal expressions
example: recognition
That means Unisys must <ev>pay</ev> about $100 million in
interest <te>every quarter</te>, on top of $27 million in
dividends on preferred stock.
Source: TRIOS TimeBank v.0.1
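To make the recognition output concrete, here is a toy tagger that wraps matched temporal expressions in `<te>` tags. The slides describe recognition as mostly machine-learning based; this sketch swaps in a small regular expression purely for illustration. The pattern and the `tag_timexes` function are assumptions, and event tagging (`<ev>`) is left out.

```python
import re

# Hypothetical, deliberately tiny pattern. It stands in for a trained
# recogniser and covers only a handful of surface forms.
TIMEX_PATTERN = re.compile(
    r"\b(?:every\s+(?:day|week|month|quarter|year)"
    r"|(?:last|next)\s+(?:week|month|year)"
    r"|today|tomorrow|yesterday)\b",
    re.IGNORECASE,
)


def tag_timexes(text):
    """Wrap each matched temporal expression in <te>...</te> tags."""
    return TIMEX_PATTERN.sub(lambda m: f"<te>{m.group(0)}</te>", text)


sentence = ("That means Unisys must pay about $100 million in interest every "
            "quarter, on top of $27 million in dividends on preferred stock.")
print(tag_timexes(sentence))
```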
12. presentation temporal expressions
example: normalisation
That means Unisys must <EVENT eid="e110" ...>pay</EVENT>
about $100 million in interest <TIMEX3 tid="t256" type="SET"
value="P1Q" temporalFunction="false"
functionInDocument="NONE" quant="every">every quarter</TIMEX3>,
on top of $27 million in dividends on preferred stock.
<TLINK lid="l32" relType="BEFORE" relatedToEvent="e110"
eventID="e107"/>
<TLINK lid="l26" relType="OVERLAP" eventID="e110"
relatedToTime="t256"/>
Source: TRIOS TimeBank v.0.1
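The annotated sentence can be read back programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` to pull the TIMEX3 attributes out of the example. Two assumptions: the sentence is wrapped in a hypothetical `<s>` root element so that it parses as XML, and the EVENT attributes elided with "..." on the slide are simply left out.

```python
import xml.etree.ElementTree as ET

# Sentence from the slide, wrapped in a hypothetical <s> root element.
# Attributes elided with "..." in the original are omitted, not guessed.
annotated = (
    '<s>That means Unisys must <EVENT eid="e110">pay</EVENT> about $100 million '
    'in interest <TIMEX3 tid="t256" type="SET" value="P1Q" '
    'temporalFunction="false" functionInDocument="NONE" quant="every">'
    'every quarter</TIMEX3>, on top of $27 million in dividends on preferred '
    'stock.</s>'
)


def extract_timexes(xml_text):
    """Return one dict per TIMEX3 element: id, type, value and surface text."""
    root = ET.fromstring(xml_text)
    return [
        {
            "tid": timex.get("tid"),
            "type": timex.get("type"),
            "value": timex.get("value"),
            "text": "".join(timex.itertext()),
        }
        for timex in root.iter("TIMEX3")
    ]


for timex in extract_timexes(annotated):
    print(timex)
```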
14. presentation temporal expressions
my contributions
■ built the first timex corpus that gathers all the
freely available timexes
● {timex, type, normalised_value, utterance_reference}
● 2822 different timexes
■ built a normaliser
● as an extension of TRIOS (University of Rochester)
● accuracy improved from 62.57% to 71.66%
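A minimal sketch of how such a corpus and the accuracy figure could be represented. The record fields mirror the tuple {timex, type, normalised_value, utterance_reference} listed above. The class name, the sample entries and the lookup-table normaliser are hypothetical, and accuracy is assumed to mean the fraction of timexes whose predicted value exactly matches the gold value.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class TimexRecord:
    timex: str                          # surface expression, e.g. "two days"
    type: str                           # DATE, TIME, DURATION or SET
    normalised_value: str               # gold normalised value
    utterance_reference: Optional[str]  # reference date of the utterance, if any


def accuracy(records: Iterable[TimexRecord],
             normaliser: Callable[[str, Optional[str]], Optional[str]]) -> float:
    """Fraction of records whose predicted value equals the gold value."""
    records = list(records)
    if not records:
        return 0.0
    correct = sum(
        1 for r in records
        if normaliser(r.timex, r.utterance_reference) == r.normalised_value
    )
    return correct / len(records)


# Illustrative entries only; they are not taken from the actual corpus.
sample = [
    TimexRecord("every quarter", "SET", "P1Q", None),
    TimexRecord("two days", "DURATION", "P2D", None),
    TimexRecord("five years", "DURATION", "P5Y", None),
    TimexRecord("three weeks", "DURATION", "P3W", None),
]

# Stand-in normaliser: a lookup table that knows only two expressions.
KNOWN = {"every quarter": "P1Q", "two days": "P2D"}


def toy_normaliser(timex, utterance_reference):
    return KNOWN.get(timex)


print(f"accuracy: {accuracy(sample, toy_normaliser):.2%}")
```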
15. presentation temporal expressions
human mistakes
utterance    expression               type       annotation
-            three years before       DATE       FUTURE_REF
26/09/2011   this morning             DATE       1998-02-06TMO
-            two decades              DURATION   P20Y
-            the summer of 1862       DATE       FUTURE_REF
-            centuries                DURATION   PXE
-            the last half of ‘80s    DATE       198
16. presentation temporal expressions
my to-do list
✓ study the literature
✓ build a corpus of timexes
✓ build a normaliser
■ release my timexes corpus freely
■ literature review
[timeline, 0 to 30 days: 22 days elapsed, 8 days remaining]