This document summarizes three temporal reasoning datasets: MCTACO, TORQUE, and TRACIE.
[1] MCTACO is a multiple choice dataset for temporal commonsense understanding with 13k question-answer pairs about the duration, order, typical time, frequency, and stationarity of events. It was created using crowdsourcing.
[2] TORQUE is a reading comprehension dataset with over 20k temporal ordering questions about events in text. It uses natural language to annotate relationships between events, addressing limitations of prior work. The questions were generated and answered through crowdsourcing.
[3] TRACIE focuses on implicit events and uses distant supervision to generate temporal relation instances between implicit and explicitly mentioned events.
This document discusses fuzzy logic, beginning with its origins in ancient Greece and its formalization in 1965 by Lotfi Zadeh. It explains that fuzzy logic represents concepts with overlapping membership functions rather than binary truth values. Fuzzy logic and neural networks both model human reasoning, but fuzzy logic uses linguistic rules while neural networks learn from examples. Fuzzy logic is applied in control systems such as temperature controllers and anti-lock braking systems to handle nonlinear dynamics, and it is used in other fields, such as business and expert systems, to represent subjective concepts.
Quantum computing provides an alternative computational model based on quantum mechanics. It utilizes quantum phenomena such as superposition and entanglement to perform computations using quantum logic gates on qubits. This allows quantum computers to potentially solve certain problems exponentially faster than classical computers. However, building large-scale quantum computers remains a challenge. In the meantime, smaller quantum systems are being developed and quantum algorithms are being experimentally tested on these devices. Researchers are also working on methods to efficiently simulate quantum computations on classical computers.
This document discusses 3D graphics and transformations. It begins by introducing the goals of 3D graphics as producing 2D images from a mathematically described 3D environment. It then covers coordinate systems, affine transformations like translation, rotation, and scaling, and how they are represented by matrices. Homogeneous coordinates are introduced to represent transformations uniformly with matrices. Quaternions are also mentioned as an alternative to rotation matrices. The document provides examples of 3D translation, rotation, and issues around representing rotations.
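To make the homogeneous-coordinate idea concrete, here is a minimal NumPy sketch (not taken from the document; the point, angle, and translation are invented for illustration) that builds 4x4 translation and rotation matrices and composes them.

```python
import numpy as np

def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

def rotation_z(theta):
    """4x4 homogeneous rotation about the z-axis by angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.eye(4)
    R[:2, :2] = [[c, -s], [s, c]]
    return R

# A 3D point in homogeneous coordinates (w = 1).
p = np.array([1.0, 0.0, 0.0, 1.0])

# Rotate 90 degrees about z, then translate by (2, 3, 0); note the
# composition order: the matrix applied first sits rightmost.
M = translation(2, 3, 0) @ rotation_z(np.pi / 2)
print(M @ p)   # -> approximately [2, 4, 0, 1]
```

Representing both rotation and translation as 4x4 matrices is exactly what lets a whole transformation chain collapse into a single matrix product.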
The document discusses quantum computing and its potential impacts. It notes that current quantum computers have around 50-70 qubits, which is small compared to classical computers, and errors still need to be addressed. Quantum computers may achieve "quantum supremacy" by solving problems that classical computers cannot. One potential impact area is cryptography - most public-key encryption relies on problems like factoring or discrete logs, which can be broken by Shor's algorithm on a large quantum computer. This is not an imminent threat but could affect secure documents stored now. Post-quantum cryptography aims to base encryption on alternative hard problems not vulnerable to quantum attacks.
This document discusses quantum channels, which are the generalization of unitary maps to mixed states. Quantum channels can be represented using Kraus operators or a process matrix. State preparation and measurement can both be modeled as quantum channels, with state preparation having a classical input space and measurement having a classical output space. More general measurements correspond to partial measurements that leave some quantum information remaining after the measurement. The Stinespring dilation theorem provides an axiomatic characterization of quantum channels in terms of an isometric embedding into a larger space.
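As an illustration of the Kraus-operator representation mentioned above, here is a minimal NumPy sketch that applies a channel of the form ρ → Σ_i K_i ρ K_i† to a density matrix; the amplitude-damping operators and the decay probability are standard textbook choices used only as an example, not taken from the document.

```python
import numpy as np

def apply_channel(rho, kraus_ops):
    """Apply a quantum channel given by a list of Kraus operators K_i:
    rho -> sum_i K_i rho K_i^dagger."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

# Amplitude-damping channel with decay probability gamma (illustrative value).
gamma = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
K1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)

# Completeness check: sum_i K_i^dagger K_i should equal the identity.
assert np.allclose(K0.conj().T @ K0 + K1.conj().T @ K1, np.eye(2))

rho = np.array([[0.0, 0.0], [0.0, 1.0]], dtype=complex)  # the excited state |1><1|
print(apply_channel(rho, [K0, K1]))  # population partially relaxes toward |0><0|
```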
The document discusses real-time systems and real-time operating systems (RTOS). It describes the types of real-time systems as hard, soft, and firm real-time. It also discusses the features expected of an RTOS, including strictly enforced priorities, low latency, and priority inheritance. The document then covers approaches to making Linux and other general purpose operating systems capable of real-time performance, including the real-time kernel patch and co-kernel approaches like Xenomai.
A basic description of how to run a space project, based on experience with the XaTcobeo CubeSat, a University of Vigo project for ESA Education developed with the help of INTA.
License: Breogan Costa, University of Vigo, CERN, JINR.
This document discusses various methods for defuzzification, which is the process of converting a fuzzy quantity into a crisp quantity. It describes seven common defuzzification methods: 1) max membership principle, 2) centroid method, 3) weighted average method, 4) mean max membership, 5) center of sums, 6) centre of largest area, and 7) first of maxima, last of maxima. For each method, it provides details on the calculation approach and formulas used to determine the defuzzified crisp value. The centroid method is noted as the most commonly used defuzzification technique.
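As a small illustration of the centroid method highlighted above, here is a minimal sketch of its discrete form, z* = Σ μ(z)·z / Σ μ(z); the triangular membership function and the universe of discourse are invented for the example.

```python
import numpy as np

def centroid_defuzzify(z, mu):
    """Centroid (center of gravity) defuzzification:
    z* = sum(mu(z) * z) / sum(mu(z))."""
    return np.sum(mu * z) / np.sum(mu)

# Example: an aggregated fuzzy output with a triangular membership function
# peaking at z = 5 over the universe [0, 10] (illustrative values only).
z = np.linspace(0, 10, 101)
mu = np.maximum(0, 1 - np.abs(z - 5) / 3)
print(centroid_defuzzify(z, mu))  # approximately 5.0 by symmetry
```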
The document discusses the relationship between pixels in an image, including pixel neighborhoods and connectivity. It defines different types of pixel neighborhoods - the 4 nearest neighbors, 8 nearest neighbors including diagonals, and boundary pixels that have fewer than 8 neighbors. Connectivity refers to whether two pixels are adjacent or connected based on their intensity values and neighborhood relationships. Specifically, it describes 4-connectivity, 8-connectivity, and m-connectivity. Regions in an image are sets of connected pixels, while boundaries separate adjacent regions.
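A minimal sketch of the 4- and 8-neighborhood definitions described above; out-of-range coordinates are filtered out, which is why boundary pixels end up with fewer than 8 neighbors. The image size is illustrative.

```python
def neighbors_4(x, y, width, height):
    """The 4-neighbors N4(p): horizontal and vertical neighbors of pixel (x, y)."""
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(i, j) for i, j in candidates if 0 <= i < width and 0 <= j < height]

def neighbors_8(x, y, width, height):
    """The 8-neighbors N8(p): N4(p) plus the four diagonal neighbors."""
    candidates = [(x + dx, y + dy)
                  for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  if not (dx == 0 and dy == 0)]
    return [(i, j) for i, j in candidates if 0 <= i < width and 0 <= j < height]

print(len(neighbors_8(0, 0, 8, 8)))  # 3: a corner pixel has fewer than 8 neighbors
print(len(neighbors_8(4, 4, 8, 8)))  # 8: an interior pixel has the full set
```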
Vector Quantization Vs Scalar Quantization (ManasiKaur)
Vector quantization has several advantages over scalar quantization for data compression:
1) Vector quantization groups input symbols into vectors and processes them together, while scalar quantization treats each symbol separately, reducing efficiency.
2) Vector quantization increases quantizer optimality and provides more flexibility for modification compared to scalar quantization.
3) Vector quantization can lower average distortion for the same number of reconstruction levels, or increase reconstruction levels for the same distortion, which scalar quantization cannot do.
Computer Networks Module 1 - part 2.pdf (ShanthalaKV)
18CS52 VTU Computer Network & Security
MODULE 1-Part 2
DNS - The Internet's Directory Service: Services Provided by DNS, Overview of How DNS Works, DNS Records and Messages; Peer-to-Peer Applications: P2P File Distribution, Distributed Hash Tables; Socket Programming: Creating Network Applications: Socket Programming with UDP, Socket Programming with TCP.
This document discusses image processing in the frequency domain using the Fourier transform. It explains that image enhancement can be performed by designing a transfer function in the frequency domain and multiplying it with the image's Fourier transform. Filtering an image corresponds to multiplying its Fourier transform by a filter transfer function. Common filters discussed include low-pass filters for smoothing and high-pass filters for sharpening. Ideal filters have abrupt cutoffs which cause ringing artifacts, while Butterworth and Gaussian filters provide smoother responses.
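A minimal NumPy sketch of the idea described above: filtering amounts to multiplying the image's (centered) Fourier transform by a transfer function, here a Gaussian low-pass filter. The cutoff value and the random test image are assumptions made for the example.

```python
import numpy as np

def gaussian_lowpass(shape, d0):
    """Gaussian low-pass transfer function H(u, v) = exp(-D^2 / (2 * d0^2)),
    where D is the distance from the center of the frequency plane."""
    rows, cols = shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    return np.exp(-D2 / (2 * d0 ** 2))

def filter_frequency_domain(image, H):
    """Multiply the (centered) Fourier transform of the image by H and invert."""
    F = np.fft.fftshift(np.fft.fft2(image))
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))

img = np.random.rand(128, 128)            # illustrative test image
smoothed = filter_frequency_domain(img, gaussian_lowpass(img.shape, d0=15))
sharp_highpass = img - smoothed           # a simple high-pass companion
```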
The document discusses amortized analysis, which averages the time required to perform a sequence of operations over all the operations. It describes three methods of amortized analysis: aggregate analysis, the accounting method, and the potential method. As an example, it analyzes the amortized cost of operations on a dynamic table using these three methods and shows that the amortized cost of insertion and deletion is O(1), even though some operations have higher actual costs when they trigger expansions or contractions of the table.
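To make the dynamic-table example concrete, here is a small sketch that counts elementary operations under capacity doubling and reports the average cost per insertion; it illustrates, rather than proves, the O(1) amortized bound.

```python
def dynamic_table_insertions(n):
    """Insert n items into a table that doubles its capacity when full.
    Returns total elementary operations (writes plus copies on expansion)."""
    capacity, size, ops = 1, 0, 0
    for _ in range(n):
        if size == capacity:      # expansion: copy all existing elements
            ops += size
            capacity *= 2
        ops += 1                  # write the new element
        size += 1
    return ops

n = 1_000_000
total = dynamic_table_insertions(n)
print(total / n)  # stays below 3, consistent with O(1) amortized cost per insertion
```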
This document discusses quantum computers and their advantages over classical computers. It begins by describing classical computers and how they operate based on classical mechanics. It then defines quantum computers, explaining that they use quantum bits that can exist in superposition and entanglement, allowing them to perform multiple computations simultaneously. The document outlines the history of quantum computing development. It also discusses quantum computing concepts like superposition and entanglement. Finally, it reviews some applications of quantum computing like simulation, encryption, and factorization algorithms.
Fuzzy logic is an approach to artificial intelligence that allows for intermediate possibilities between binary values like yes and no. It imitates how humans make decisions with uncertainty. Fuzzy logic can be implemented in both hardware and software across systems of varying sizes, from small microcontrollers to large systems. It is useful for commercial and practical applications to control machines and products. While it may not provide perfectly accurate reasoning, it can provide acceptable reasoning and help deal with uncertainty in engineering problems.
DiscoRank: optimizing discoverability on SoundCloud (Amélie Anglade)
These are the slides of the presentation I gave at the Realtime Conf EU on 23rd April 2013.
The full abstract of the talk can be found here: http://lanyrd.com/2013/realtime-conf-europe/scdtyf/
The document provides an overview of fuzzy logic concepts including types of fuzzy systems, membership functions, fuzzy inference, fuzzification and defuzzification methods. It discusses knowledge-based and rule-based fuzzy systems, types of membership functions like triangular, trapezoidal and Gaussian. Examples of fuzzy logic applications in autonomous driving cars and methods for defuzzification like weighted average, centroid, max-membership and centre of sums are also summarized.
This document presents information on Hopfield networks through a slideshow presentation. It begins with an introduction to Hopfield networks, describing them as fully connected, single layer neural networks that can perform pattern recognition. It then discusses the properties of Hopfield networks, including their symmetric weights and binary neuron outputs. The document proceeds to provide derivations of the Hopfield network model based on an additive neuron model. It concludes by discussing applications of Hopfield networks.
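A minimal sketch of the Hopfield model summarized above: Hebbian training produces a symmetric weight matrix with a zero diagonal, and asynchronous sign updates drive a corrupted pattern back toward a stored one. The pattern, network size, and noise are invented for illustration.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian learning: W = (1/N) * sum_p x_p x_p^T with zero diagonal."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0)
    return W

def recall(W, state, steps=100, rng=np.random.default_rng(0)):
    """Asynchronous updates: s_i <- sign(sum_j w_ij s_j)."""
    state = state.copy()
    for _ in range(steps):
        i = rng.integers(len(state))
        state[i] = 1 if W[i] @ state >= 0 else -1
    return state

stored = np.array([[1, -1, 1, -1, 1, -1, 1, -1]])     # one bipolar pattern
W = train_hopfield(stored)
noisy = stored[0].copy()
noisy[:2] *= -1                                       # flip two bits
print(recall(W, noisy))                               # converges back to the stored pattern
```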
This technology currently holds one of the most important positions in the new era. People can do many things with BCI technology; it was originally developed for paralyzed people and was first tested on a monkey. There is a lot more to learn about the basics from the slides.
This slide deck was prepared by our teammate Shourav Das.
The document discusses efficient codebook design for image compression using vector quantization. It introduces data compression techniques, including lossless compression methods like dictionary coders and entropy coding, as well as lossy compression methods like scalar and vector quantization. Vector quantization maps vectors to codewords in a codebook to compress data. The LBG algorithm is described for generating an optimal codebook by iteratively clustering vectors and updating codebook centroids.
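A minimal sketch of the codebook-update loop at the core of the LBG algorithm described above (essentially k-means over training vectors): assign each vector to its nearest codeword, then move each codeword to the centroid of its cluster. The data, codebook size, and iteration count are illustrative.

```python
import numpy as np

def lbg_codebook(vectors, codebook_size, iterations=20, rng=np.random.default_rng(0)):
    """Generate a vector-quantization codebook by iterative nearest-neighbor
    assignment and centroid updates (the core loop of the LBG algorithm)."""
    codebook = vectors[rng.choice(len(vectors), codebook_size, replace=False)]
    for _ in range(iterations):
        # Assign every training vector to its nearest codeword.
        dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each codeword to the centroid of the vectors assigned to it.
        for k in range(codebook_size):
            members = vectors[labels == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook

data = np.random.default_rng(1).normal(size=(1000, 4))   # illustrative 4-D training vectors
codebook = lbg_codebook(data, codebook_size=16)
print(codebook.shape)                                     # (16, 4)
```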
Time: Structural alignment and retrospective duration estimates (John Dennis)
From a psychological point of view, the concept of time estimation is presented; we also discuss some perspectives and problems in psychological research on time estimation.
Understanding which events are mentioned in unstructured natural language texts, and which relations connect them is a fundamental task for many applications in natural language processing (NLP), such as personalized news systems, question answering and summarization. A notably challenging problem related to event processing is recognizing the relations that hold between events, in particular temporal and causal relations. Having knowledge about such relations is necessary to build event timelines from text and could be useful for future event prediction, risk analysis and decision making support. While there has been some research on temporal relations, the aspect of causality between events from an NLP perspective has hardly been touched, even though it has a long-standing tradition in psychology and formal linguistic fields. We propose an annotation scheme to cover different types of causality between events, techniques for extracting such relations and an investigation into the connection between temporal and causal relations. The latter will be the focus of this thesis work because causality clearly has a temporal constraint. We claim that injecting this precondition may be beneficial for the recognition of both temporal and causal relations.
Knowledge representation events in Artificial Intelligence.pptx (kitsenthilkumarcse)
As of my last knowledge update in January 2022, here are some key events and trends related to knowledge representation in the field of artificial intelligence (AI):
Knowledge Graphs: Knowledge graphs became increasingly important for representing structured and interconnected knowledge. Major organizations and platforms, such as Google's Knowledge Graph, Facebook's Open Graph, and Wikidata, continued to expand their knowledge graph initiatives.
Semantic Web: The development of the Semantic Web continued to progress, with standards like RDF (Resource Description Framework) and OWL (Web Ontology Language) being used to structure and represent data on the web in a semantically meaningful way.
Ontologies and Industry Standards: Ontologies, which are formal representations of knowledge in specific domains, gained prominence in various fields. Industry-specific ontologies and standards, such as HL7 FHIR in healthcare, were developed and adopted to improve data interoperability.
AI and Knowledge-Based Systems: Knowledge representation remained a foundational component of AI systems, particularly in expert systems. These systems were used in various applications, including medical diagnosis, financial analysis, and troubleshooting.
Hybrid Models: Researchers explored hybrid models that combined symbolic AI (knowledge-based) with connectionist AI (neural networks) to leverage the strengths of both approaches. This approach aimed to address the limitations of purely symbolic or purely neural models.
AI in Chatbots and Virtual Assistants: Chatbots and virtual assistants, powered by knowledge representation and natural language processing, continued to advance, offering improved conversational capabilities and knowledge retrieval.
Knowledge Representation for Explainable AI: As the need for explainable AI grew, knowledge representation played a role in providing transparent and interpretable models. This was particularly important in domains where AI decisions had significant consequences, such as healthcare and finance.
Research in Commonsense Reasoning: Advancements were made in commonsense reasoning, with the goal of enabling AI systems to understand and reason about everyday human knowledge and context.
Knowledge Representation and COVID-19: During the COVID-19 pandemic, knowledge representation played a vital role in aggregating and organizing data related to the virus, treatments, and vaccine research.
AI and the Semantic Web: The integration of AI technologies with the Semantic Web aimed to make web data more semantically meaningful, enhancing search engines, recommendation systems, and data integration.
An Analysis of Causality between Events and its Relation to Temporal Information (Paramita Mirza)
In Proceedings of the 25th International Conference on Computational Linguistics
In this work we present an annotation framework to capture causality between events, inspired by TimeML, and a language resource covering both temporal and causal relations. This data set is then used to build an automatic extraction system for causal signals and causal links between given event pairs. The evaluation and analysis of the system’s performance provides an insight into explicit causality in text and the connection between temporal and causal relations.
The document discusses various techniques for semantic processing in natural language processing, including:
1) Semantic processing aims to understand the meaning of sentences by associating words with semantic markers to resolve lexical ambiguity and preferences for certain subject-verb combinations.
2) Semantic grammars encode semantic information in syntactic rules and are designed around key semantic concepts.
3) Case grammars analyze meaning by identifying noun phrases that fill semantic roles or "cases" like agent, object, location for verbs.
4) Compositional semantics provides a step-by-step mapping from syntactic structures to semantic interpretations based on a knowledge base.
The document discusses knowledge representation and reasoning in artificial intelligence. It covers the following key points:
Intelligent agents should have the capacity for perceiving, representing knowledge, reasoning about what they know, and acting. Knowledge representation involves representing an understanding of the world, while reasoning involves inferring implications of what is known. Logic provides a way to represent and reason about knowledge through specifying a logical language with syntax, semantics, and inference rules.
NoSQL databases were created to solve scalability problems with SQL databases. It turns out these problems are profoundly connected with Einstein's theory of relativity (no, honestly), and understanding this illuminates the SQL/NoSQL divide in surprising ways.
This document provides an overview of natural language processing (NLP). It discusses how NLP systems have achieved shallow matching to understand language but still have fundamental limitations in deep understanding that requires context and linguistic structure. It also describes technologies like speech recognition, text-to-speech, question answering and machine translation. It notes that while text data may seem superficial, language is complex with many levels of structure and meaning. Corpus-based statistical methods are presented as one approach in NLP.
Time underlies many interesting human behaviors. Thus, the question of how to represent time in connectionist models is very important. One approach is to represent time implicitly by its effects on processing rather than explicitly (as in a spatial representation). The current report develops a proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory. In this approach, hidden unit patterns are fed back to themselves; the internal representations which develop thus reflect task demands in the context of prior internal states. A set of simulations is reported which range from relatively simple problems (temporal version of XOR) to discovering syntactic/semantic features for words. The networks are able to learn interesting internal representations which incorporate task demands with memory demands; indeed, in this approach the notion of memory is inextricably bound up with task processing. These representations reveal a rich structure, which allows them to be highly context-dependent while also expressing generalizations across classes of items. These representations suggest a method for representing lexical categories and the type/token distinction.
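A minimal sketch of the simple recurrent (Elman-style) network the abstract describes: at each time step the hidden state is computed from the current input and the previous hidden state, so hidden unit patterns are fed back to themselves. The sizes, weights, and input sequence are random placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 2               # illustrative sizes

W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # recurrent (context) links
W_hy = rng.normal(scale=0.1, size=(n_out, n_hidden))     # hidden -> output

def run_sequence(xs):
    """Feed a sequence through the network; the hidden state carries memory."""
    h = np.zeros(n_hidden)
    outputs = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h)      # hidden units fed back to themselves
        outputs.append(W_hy @ h)
    return outputs

sequence = [rng.normal(size=n_in) for _ in range(5)]
print([o.round(3) for o in run_sequence(sequence)])
```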
The Complexity of Data: Computer Simulation and “Everyday” Social Science (Edmund Chattoe-Brown)
Although the existence of various forms of complexity in social systems is now widely recognised, this approach to explanation faces two major challenges that turn out to be intimately connected. The first is the existing conflict in social science between “micro” and “macro” styles of social explanation. The second is the relationship of complexity to the kind of data routinely collected in social science. In order to be accepted, complexity approaches need simultaneously to dodge the first conflict while making much better use of existing forms of data.
The first part of the talk will provide an introduction to the simulation approach and a discussion of various concepts in complexity with reference to simulation as a distinctive theory-building tool and methodology. The second part of the talk will develop these ideas in more depth using simulations by the author as case studies.
Knowledge representation and reasoning (KR) is the field of artificial intelligence (AI) dedicated to representing information about the world in a form that a computer system can utilize to solve complex tasks such as diagnosing a medical condition or having a dialog in a natural language.
The document describes using a Monte Carlo simulation to model the process of generating English names over multiple generations. It outlines the key processes in the model, including prepending nicknames, appending place names or occupations, shortening names by dropping syllables, and rejecting identical names. The simulation is run with adjustable parameters and the results are compared statistically to real name data to test how well the model fits, rather than requiring an exact match of individual names. The simulation is found to match some properties like name length distribution but not others like the frequency of names containing "smith". This highlights both the utility and limitations of the Monte Carlo approach for this problem.
Foundations of Knowledge Representation in Artificial Intelligence.pptx (kitsenthilkumarcse)
Knowledge representation in artificial intelligence (AI) is a fundamental concept that involves the process of structuring and encoding knowledge so that AI systems can understand, reason, and make decisions. Effective knowledge representation is essential for AI systems to model and work with complex real-world information. Here are some key aspects of knowledge representation in AI:
Symbolic Knowledge Representation: This approach uses symbols and rules to represent knowledge. It involves encoding information using symbols, predicates, and logical statements. Common formalisms include first-order logic and propositional logic. Symbolic representation is particularly suited for knowledge-based systems and expert systems.
Semantic Networks: In a semantic network, knowledge is represented using nodes and links to denote relationships between concepts. This form of representation is intuitive and is often used for organizing knowledge in a structured manner.
Frames and Ontologies: Frames and ontologies are used to represent knowledge by structuring information into frames or classes. Frames contain attributes and values, and they help in organizing and categorizing knowledge. Ontologies, such as OWL (Web Ontology Language), provide a more formal representation of knowledge for use in the semantic web and knowledge graphs.
Rule-Based Systems: Rule-based systems use a set of rules to represent and reason with knowledge. These rules can be encoded in the form of "if-then" statements, allowing AI systems to make decisions and draw inferences (a minimal sketch of this idea follows this list).
Fuzzy Logic: Fuzzy logic allows for the representation of uncertainty and vagueness in knowledge. It is particularly useful in situations where information is not black and white but falls within degrees of truth.
Bayesian Networks: Bayesian networks represent knowledge using probability distributions and conditional dependencies. They are valuable for modeling uncertain or probabilistic relationships in various domains, such as medical diagnosis and risk analysis.
Connectionist Models: Connectionist models, like neural networks, use distributed representations to encode knowledge. In these models, knowledge is spread across interconnected nodes (neurons), and learning occurs through the adjustment of connection weights. These networks are particularly effective in tasks such as pattern recognition and natural language processing.
Hybrid Approaches: Many AI systems use a combination of different knowledge representation techniques to address the complexities of real-world problems. For instance, combining symbolic representation with connectionist models is a common approach in modern AI.
The choice of knowledge representation method depends on the specific problem domain, the nature of the data, and the requirements of the AI system.
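As a tiny illustration of the rule-based approach listed above, here is a sketch of forward chaining over "if-then" rules; the facts and rules are invented purely for the example.

```python
# Forward chaining over simple "if-then" rules: keep firing rules whose
# premises are all known facts until no new facts can be derived.
rules = [
    ({"has_fever", "has_cough"}, "possible_flu"),            # illustrative rules
    ({"possible_flu", "high_risk_patient"}, "refer_to_doctor"),
]
facts = {"has_fever", "has_cough", "high_risk_patient"}

changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)   # now includes 'possible_flu' and 'refer_to_doctor'
```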
Debs 2010 context based computing tutorial (Opher Etzion)
The document discusses context-aware computing and its utilization in event-based systems. It provides an agenda for a tutorial that covers context in general, context categories including temporal, spatial, state and segmentation contexts, and context implementation in practice and event processing. The document also provides examples of different types of contexts and how context can be composed from multiple contexts.
The document discusses the need for ontologies that can better support linking and mapping between large, distributed databases on the semantic web. While OWL has been successful in some domains, it lacks expressivity for tasks like representing part-whole relations, temporal reasoning, and procedural knowledge. A new generation of ontology languages may need to relax requirements like decidability in order to more powerfully represent relationships that are important for data integration and discovery across multiple knowledge sources.
1) The medial temporal lobe (MTL), including the hippocampus and parahippocampal cortex, is engaged both when remembering past events and imagining future events.
2) A study found activity in these MTL regions when participants thought about personal past or future events that were near or distant in time.
3) Many MTL regions showed similar responses to temporal distance (near vs. distant) for both past and future events, though one region in the parahippocampal cortex differed between past and future.
Understanding narrative text is more than simple information extraction on a sentence-by-sentence basis. To comprehend the true meaning of a narrative requires determining the connections between the sentences and the effect of one event on other events. This story understanding process can be greatly enhanced by the use of event descriptor templates that begin with the basic journalistic questions of who, what, when, where, why, and how but that go beyond these simple basics to address more complex relationships: role playing, context, impact, causality, and interests. Previously, representing story narratives as knowledge representations has required intensive manual effort on the part of trained knowledge engineers to correctly encode the contents of stories into a knowledge base (KB). For large volumes of text, this becomes impractical, limiting the usefulness of KB-based systems in question-answering. This paper describes a means of automating the narrative representation process by using event descriptor templates to elicit critical narrative information to be encoded in a knowledge based system.
Learning from Time-to-Event Data from Online Learning Contexts (Shalin Hai-Jew)
Time-to-event analysis is a statistical analysis approach that enables time-based insights about student learning, such as, How long does it take before a learner makes a new acquaintance in an online course? A new friend? How long does it take before a learner achieves breakout capacity in a particular learning sequence? How long does it take for a learner to commit to a course? This digital poster session presents time-to-event analysis (aka “survival analysis”) from real LMS data and shows how this analysis is done. Terms related to time-to-event analysis will be introduced, and the assertability of extracted data is explored.
Time-to-event analysis, in its simplest form, enables the study of in-world phenomena which includes the time it takes to achieve a particular defined “event” (whether negative or positive, desirable or undesirable), and it includes the nuance of “censored” data (in-world records for which data about event achievement was not attained during the time period of the analysis). This presentation introduces “time-to-event analysis” (on IBM’s SPSS Statistics) as applied to online educational data.
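To make time-to-event analysis concrete, here is a minimal sketch of a Kaplan-Meier survival estimate over right-censored observations (for instance, days until a learner first posts); the data values are invented for illustration and no statistics package is assumed.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier estimator: S(t) = product over event times t_i <= t of
    (1 - d_i / n_i), where d_i = events at t_i and n_i = subjects still at risk.
    `events` is 1 for an observed event, 0 for a censored observation."""
    times, events = np.asarray(times), np.asarray(events)
    survival, s = [], 1.0
    for t in np.unique(times[events == 1]):
        n_at_risk = np.sum(times >= t)
        d = np.sum((times == t) & (events == 1))
        s *= 1 - d / n_at_risk
        survival.append((t, s))
    return survival

# Illustrative data: 8 learners; 0 marks learners censored before the event.
times  = [3, 5, 5, 8, 10, 12, 12, 15]
events = [1, 1, 0, 1,  0,  1,  1,  0]
print(kaplan_meier(times, events))
```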
20230419-LLaMA-Adapter_ Efficient Fine-tuning of Language Models with Zero-in... (San Kim)
1. LLaMA-Adapter is a new method for efficiently fine-tuning large language models with zero-init attention.
2. It freezes the pre-trained parameters of LLaMA and only learns lightweight adaption prompts, requiring much less computation than fully fine-tuning the large model.
3. Experimental results show that LLaMA-Adapter achieves comparable performance to fully fine-tuned models while being over 3 times faster to train.
Dictionary-assisted supervised contrastive learning (DASCL) is a method that leverages specialized dictionaries when fine-tuning pretrained language models. It combines cross-entropy loss with a supervised contrastive learning objective to improve classification performance, particularly in few-shot learning settings. Evaluations on tasks like sentiment analysis and abuse detection found that DASCL outperforms cross-entropy alone or supervised contrastive learning without dictionaries. Interpretability techniques like contrastive explanations can provide insights into why models make predictions by comparing predictions to alternative options.
LongT5_Efficient Text-toText Transformer for Long Sequences_san.pptx (San Kim)
LongT5 is a new Transformer architecture and attention mechanism called TGlobal that allows scaling of both input length and model size. LongT5 achieves state-of-the-art results on various datasets including arXiv, PubMed, BigPatent, MediaSum, and TriviaQA. TGlobal mimics the local/global mechanism of ETC but can be used as a drop-in replacement for regular attention in models like T5.
The document presents a study that evaluates the quality of word representations learned by multimodal pre-trained transformers. It discusses prior work showing that human concepts are grounded in sensory experiences and the advantage of multimodal representations over text-only ones, particularly for concrete words. The study aims to intrinsically evaluate how well semantic representations from models like LXMERT and ViLBERT align with human intuitions by comparing word similarity to human judgments. It obtains static embeddings from contextualized representations to measure semantic similarity independent of context.
Compeition-Level Code Generation with AlphaCode.pptx (San Kim)
AlphaCode is a system for competitive code generation that achieves top 54.3% performance on average in competitions with over 5,000 participants. It uses a large transformer model pre-trained on GitHub code and fine-tuned on a competitive programming dataset. During fine-tuning, it employs techniques like tempering and GOLD to focus on precision over recall. At test time, it generates a large number of samples, filters them based on example tests, and clusters similar programs to select submissions. Extensive evaluations on CodeContests and APPS benchmarks show AlphaCode's performance scales log-linearly with more samples and compute.
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tu... (San Kim)
1) The study measures the intrinsic dimensionality of various natural language tasks when fine-tuned on top of large pretrained language models. It finds that common NLP tasks can be learned with very few parameters, sometimes only a few hundred, indicating that pretraining provides an effective compression framework that minimizes the description length of downstream tasks.
2) As the number of parameters in the pretrained model increases, the intrinsic dimensionality of fine-tuning decreases, showing that more parameters lead to more efficient representations.
3) Models with lower intrinsic dimensions across tasks achieve better performance with higher accuracies and smaller generalization gaps, suggesting intrinsic dimensionality is correlated with generalization.
1. “Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding (EMNLP 19) – MCTACO
2. TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions (EMNLP 20)
3. Temporal Reasoning on Implicit Events from Distant Supervision (NAACL 21) – TRACIE
Answering complex open domain questions with multi-hop dense retrieval (San Kim)
The document summarizes a research paper on multi-hop dense retrieval for answering complex open-domain questions. The key points are:
1) Traditional information retrieval methods struggle with semantic matching for complex multi-hop questions that require aggregating information from multiple documents.
2) The proposed method iteratively encodes the question and previously retrieved documents to retrieve the next relevant documents through efficient maximum inner product search (a minimal sketch of this retrieval step follows this list).
3) Experiments show the method can accurately discover document sequences to answer multi-hop questions from unstructured text, without linking information, and matches state-of-the-art results while being faster.
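A minimal sketch of the maximum-inner-product retrieval step referenced in point 2) above, using brute-force NumPy dot products in place of an approximate index such as FAISS; the embeddings, the stand-in "re-encoding" by vector addition, and all sizes are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
doc_embeddings = rng.normal(size=(10_000, 128))   # placeholder dense document vectors

def retrieve(query_vec, doc_embeddings, k=5):
    """Maximum inner product search: score every document by dot product
    with the query vector and return the indices of the top-k documents."""
    scores = doc_embeddings @ query_vec
    top_k = np.argpartition(-scores, k)[:k]
    return top_k[np.argsort(-scores[top_k])]

# Hop 1: retrieve with the encoded question; hop 2: re-encode the question
# together with the hop-1 document (faked here by vector addition) and retrieve again.
q1 = rng.normal(size=128)
hop1 = retrieve(q1, doc_embeddings)
q2 = q1 + doc_embeddings[hop1[0]]                 # stand-in for re-encoding
hop2 = retrieve(q2, doc_embeddings)
print(hop1, hop2)
```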
Measuring massive multitask language understanding (San Kim)
A new test to measure a text model's multitask accuracy. The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more.
A paper review. This presentation introduces Abductive Commonsense Reasoning, a paper published at ICLR 2020. In this paper, the authors use commonsense to generate plausible hypotheses. They build a new dataset, ART, and propose new models for aNLI and aNLG using BERT and GPT.
The document summarizes the ELECTRA model for pre-training text encoders. ELECTRA uses a more sample efficient approach of pre-training bidirectional transformers by replacing masked tokens with generated samples instead of masking. It pre-trains by having a generator model produce corrupted versions of input sequences, and a discriminator model distinguishes real sequences from generated ones. ELECTRA outperforms other pre-trained models on several tasks while being more sample efficient and not having the discrepancy issue of masked language models.
XLNET, RoBERTa, and Reformer are state-of-the-art language models. XLNET improves on BERT by capturing dependency between target pairs. RoBERTa further improves pre-training by removing the next sentence prediction objective, training longer sequences with bigger batches. Reformer introduces efficient attention and feedforward mechanisms like reversible layers and locality-sensitive hashing to process long sequences with less memory.
This slide deck introduces Transformer-XL, the paper that XLNet builds on. It explains the paper's major contribution and also reviews the original Transformer in order to compare the Transformer and Transformer-XL. Happy NLP!
This document summarizes research on deep learning approaches for face recognition. It describes the DeepFace model from Facebook, which used a deep convolutional network trained on 4.4 million faces to achieve state-of-the-art accuracy on the Labeled Faces in the Wild (LFW) dataset. It also summarizes the DeepID2 and DeepID3 models from Chinese University of Hong Kong, which employed joint identification-verification training of convolutional networks and achieved performance comparable or superior to DeepFace on LFW. Evaluation metrics for face verification and identification tasks are also outlined.
The document discusses neural networks, generative adversarial networks, and image-to-image translation. It begins by explaining how neural networks learn through forward propagation, calculating loss, and using the loss to update weights via backpropagation. Generative adversarial networks are introduced as a game between a generator and discriminator, where the generator tries to fool the discriminator and vice versa. Image-to-image translation uses conditional GANs to translate images from one domain to another, such as maps to aerial photos.
This document summarizes machine learning frameworks and libraries, neural network structures, and the process of building and training a neural network model for image classification. It discusses TensorFlow and PyTorch frameworks, describes the structure of a convolutional neural network, and provides code to import datasets, define a model, train the model on GPUs, and test the model's accuracy.
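A minimal PyTorch sketch of the kind of model definition and training loop the document describes, with random tensors standing in for a real image dataset; the architecture and hyperparameters are placeholders, not the ones from the slides.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """A tiny convolutional classifier for 32x32 RGB images (illustrative)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

device = "cuda" if torch.cuda.is_available() else "cpu"
model = SmallCNN().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Toy batch standing in for a real dataset loader.
images = torch.randn(64, 3, 32, 32, device=device)
labels = torch.randint(0, 10, (64,), device=device)

for step in range(5):                         # a few illustrative training steps
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    print(step, loss.item())
```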
This document discusses the process of backpropagation in neural networks. It begins with an example of forward propagation through a neural network with an input, hidden and output layer. It then introduces backpropagation, which uses the calculation of errors at the output to calculate gradients and update weights in order to minimize the overall error. The key steps are outlined, including calculating the error derivatives, weight updates proportional to the local gradient, and backpropagating error signals from the output through the hidden layers. Formulas for calculating each step of backpropagation are provided.
Deep learning study 1. This slide deck covers basic mathematical background for deep learning, such as Bayes's theorem, Bayesian inference, and information theory.
These slides explain backpropagation with a simple example. Normally, we use cross-entropy as the loss function and set the activation function of the output layer to the logistic sigmoid, because we want to maximize the (log) likelihood (or, equivalently, minimize the negative (log) likelihood), and because the binomial distribution is the maximum-entropy distribution for two-class classification. In this example, however, the loss function (also called the objective or cost function) is taken to be the sum of squares, which is commonly used in regression, in order to simplify the problem. A minimal sketch of this setup follows.
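A minimal NumPy sketch of the setup these notes describe, assuming one hidden layer, logistic-sigmoid activations, and a sum-of-squares loss; the toy data, layer sizes, and learning rate are invented for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))                      # toy inputs
y = (X[:, :1] + X[:, 1:] > 0).astype(float)      # toy two-class targets
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)    # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)    # hidden -> output
lr = 0.5

for _ in range(1000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass for the sum-of-squares loss E = 0.5 * sum((out - y)^2):
    # local gradients use the sigmoid derivative s * (1 - s).
    delta_out = (out - y) * out * (1 - out)          # error signal at the output
    delta_h = (delta_out @ W2.T) * h * (1 - h)       # backpropagated to the hidden layer
    W2 -= lr * h.T @ delta_out; b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_h;   b1 -= lr * delta_h.sum(axis=0)

print(np.round(out.ravel(), 2), y.ravel())            # outputs move toward the targets
```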
The binding of cosmological structures by massless topological defects (Sérgio Sacani)
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is mitigated, at least in part.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste... (Sérgio Sacani)
Context. With a mass exceeding several 10⁴ M⊙ and a rich and dense population of massive stars, supermassive young star clusters represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars. The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically, the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec. Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a photon flux threshold of approximately 2 × 10⁻⁸ photons cm⁻² s⁻¹. The X-ray sources exhibit a highly concentrated spatial distribution, with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
Current Ms word generated power point presentation covers major details about the micronuclei test. It's significance and assays to conduct it. It is used to detect the micronuclei formation inside the cells of nearly every multicellular organism. It's formation takes place during chromosomal sepration at metaphase.
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxMAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation makes them the most convenient, least labor-intensive, live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poorquality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larva. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...AbdullaAlAsif1
The pygmy halfbeak Dermogenys colletei, is known for its viviparous nature, this presents an intriguing case of relatively low fecundity, raising questions about potential compensatory reproductive strategies employed by this species. Our study delves into the examination of fecundity and the Gonadosomatic Index (GSI) in the Pygmy Halfbeak, D. colletei (Meisner, 2001), an intriguing viviparous fish indigenous to Sarawak, Borneo. We hypothesize that the Pygmy halfbeak, D. colletei, may exhibit unique reproductive adaptations to offset its low fecundity, thus enhancing its survival and fitness. To address this, we conducted a comprehensive study utilizing 28 mature female specimens of D. colletei, carefully measuring fecundity and GSI to shed light on the reproductive adaptations of this species. Our findings reveal that D. colletei indeed exhibits low fecundity, with a mean of 16.76 ± 2.01, and a mean GSI of 12.83 ± 1.27, providing crucial insights into the reproductive mechanisms at play in this species. These results underscore the existence of unique reproductive strategies in D. colletei, enabling its adaptation and persistence in Borneo's diverse aquatic ecosystems, and call for further ecological research to elucidate these mechanisms. This study lends to a better understanding of viviparous fish in Borneo and contributes to the broader field of aquatic ecology, enhancing our knowledge of species adaptations to unique ecological challenges.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
ESPP presentation to EU Waste Water Network, 4th June 2024 “EU policies driving nutrient removal and recycling
and the revised UWWTD (Urban Waste Water Treatment Directive)”
ESR spectroscopy in liquid food and beverages.pptxPRIYANKA PATEL
With increasing population, people need to rely on packaged food stuffs. Packaging of food materials requires the preservation of food. There are various methods for the treatment of food to preserve them and irradiation treatment of food is one of them. It is the most common and the most harmless method for the food preservation as it does not alter the necessary micronutrients of food materials. Although irradiated food doesn’t cause any harm to the human health but still the quality assessment of food is required to provide consumers with necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during the processing of the food. ESR spin trapping technique is useful for the detection of highly unstable radicals in the food. The antioxidant capability of liquid food and beverages in mainly performed by spin trapping technique.
Phenomics assisted breeding in crop improvementIshaGoswami9
As the population is increasing and will reach about 9 billion upto 2050. Also due to climate change, it is difficult to meet the food requirement of such a large population. Facing the challenges presented by resource shortages, climate
change, and increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional
genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding
the complex characteristics of multiple gene, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data that can
be linked to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus,
high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes
during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology,
and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
1. Temporal Reasoning Task
San Kim
2021.06.30
1. “Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal
Commonsense Understanding (EMNLP 19) – MCTACO
2. TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions (EMNLP 20)
3. Temporal Reasoning on Implicit Events from Distant Supervision (NAACL 21)-TRACIE
2. MCTACO (Multiple choice temporal common-sense)
Temporal commonsense
Given two events “going on a vacation” and “going for a walk,” most humans would know that a
vacation is typically longer and occurs less often than a walk, but it is still challenging for computers
to understand and reason about temporal commonsense.
5 temporal properties
• Duration (how long an event takes)
• Temporal ordering (typical order of events)
• Typical time (when an event happens)
• Frequency (how often an event occurs)
• Stationarity (whether a state holds for a very long time or
indefinitely)
3. MCTACO
• MCTACO comprises 13k tuples of the form (sentence, question, candidate answer).
• The sentences in those tuples are randomly selected from MultiRC.
• Collect questions and candidate answers (both correct and wrong ones) using AMT (Amazon Mechanical Turk).
• To ensure the quality of the results, they limit the annotations to native speakers and use
qualification tryouts.
• Step 1. Question Generation
• Questions should ask about one of the five temporal phenomena defined earlier
• Questions should not be solvable simply by a word or phrase from the original sentence
• They also require crowdsourcers to provide a correct answer for each of their questions (correct and incorrect answers)
• Step 2. Question verification
• They ask another two crowdsourcers to check the questions generated in Step 1: (a) whether the two requirements are satisfied and (b) whether the question is grammatically and logically correct.
• For valid questions, they continue to ask crowdsourcers to give one correct answer and one incorrect answer
4. MCTACO
• Step 3. Candidate answer expansion
• Until this stage, they have collected a small set of candidate answers (3 positive and 2 negative) for each question.
• They automatically expand this set in three ways:
• Use a set of rules to extract numbers and quantities (“2”, “once”) and temporal terms (e.g., “a.m.”, “1990”, “afternoon”, “day”), and then randomly perturb them based on a list of temporal units (“second”), adjectives (“early”), points (“a.m.”) and adverbs (“always”). (e.g., “2 a.m.” → “3 p.m.”, “1 day” → “10 days”, “once a week” → “twice a month”)
• Mask each individual token in a candidate answer (one at a time) and use BERT to predict replacements for each missing term; they rank those predictions by BERT’s confidence and keep the top three (a minimal sketch of this masking step appears below).
• For candidates that represent events, such token-level perturbations rarely lead to an interesting and diverse set of candidate answers and may produce invalid phrases (e.g., “he left the house” → “he walked the house”). Therefore, to perturb such candidates, they create a pool of 60k event phrases using PropBank and replace the candidate answers with the most similar phrases extracted by an information retrieval (IR) system.
• This expands the candidate answer set to 20 candidates per question.
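Below is a hedged sketch of the BERT-based masking step described above, using the Hugging Face transformers fill-mask pipeline. The model choice (bert-base-uncased) and the helper name expand_candidate are assumptions, not the authors' exact implementation.

```python
# Sketch of candidate-answer expansion via masked-token prediction:
# mask each token of a candidate answer and keep BERT's top-3 replacements.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def expand_candidate(answer: str, top_k: int = 3):
    tokens = answer.split()
    expansions = []
    for i in range(len(tokens)):
        # replace the i-th token with the [MASK] token
        masked = " ".join(tokens[:i] + [fill_mask.tokenizer.mask_token] + tokens[i + 1:])
        # rank replacements by BERT's confidence and keep the top-k
        for pred in fill_mask(masked, top_k=top_k):
            expansions.append(pred["sequence"])
    return expansions

print(expand_candidate("once a week"))
```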
• Step 4. Answer labeling
• Each (sentence, question, answer) tuple
produced earlier is labeled by 4
crowdsourcers, with three options: “likely”,
“unlikely”, or “invalid”.
• A tuple is kept only if all 4 annotators
agree on “likely” or “unlikely”.
6. TORQUE
• Time is important for understanding events and stories
described in natural language text.
• “he won the championship yesterday” is different from “he
will win the championship tomorrow” (explicit)
• If we read that a woman is “expecting the birth of her first
child”, we know that the birth is in the future, while if she
is “mourning the death of her mother”, the death is in the
past. (implicit)
• These relationships between an event and a time point
(e.g. “won the championship yesterday”) or between two
events (e.g., “expecting” is before “birth” and “mourning”
is after “death”) are called temporal relations.
7. TORQUE
• Challenges of RC for temporal relationships
1. Reading comprehension works rarely require event understanding. Most datasets largely only
require an understanding of predicates and arguments, and would ask questions like “what was a
woman trapped in?”. But a temporal relation question would be “what started before a woman was
trapped?” To answer it, the system needs to identify events (e.g., LANDSLIDE is an event and “body”
is not), the time of these events (e.g., LANDSLIDE is a correct answer, while SAID is not because of
the time when the two events happen), and look at the entire passage rather than the local
predicate-argument structures within a sentence (e.g., SNOW and RAINFALL are correct answers to
the question above).
2. There are many events in a typical passage of text, so temporal relation questions typically query
more than one relationship at the same time. This means that a question can have multiple
answers (e.g., “what happened after the landslide?”), or no answers, because the question may be
beyond the time scope (e.g., “what happened before the snow started?”)
8. TORQUE
3. Temporal relations queried by natural language questions are often sensitive to a few key words such as before, after, and start. Those questions can easily be changed to make contrasting questions with dramatically different answers. Models that are not sensitive to these small changes in question words will perform poorly on this task.
[Figure: example passage with contrast questions and their answer sets (e.g., {landslide}, {searching, said, found}, {landslide, trapped}, {said}, or “no answers”), illustrating how small changes to the question words change the answers.]
9. TORQUE
• Annotate 3.2k text snippets randomly selected from the TempEval3 dataset.
• TORQUE has 25k events and 21k user-generated and fully answered temporal relation questions.
• RoBERTa-large achieves 51% in exact match on TORQUE after fine-tuning, about 30% behind
human performance.
• Generally speaking, an event involves a predicate and its arguments.
• When studying time, events were defined as actions/states triggered by verbs, adjectives, and
nominals.
• This work follows this line of event definition and uses event and event trigger interchangeably.
• Define an event to be either a verb or a noun.
• In copular constructions, they choose to label the verb as the event, instead of an adjective or preposition, for a consistent treatment of “she was on the east coast yesterday” and “she was happy” that is easy to teach to crowd workers. (Note that from the perspective of data collection, labeling the copula does not lose information, as one can always do post-processing using dependency parsing or semantic role labeling to recover the connection between “was” and “happy”.)
Events
10. TORQUE
• Events expressed in text are not always factual. They
can be negated, uncertain, hypothetical or have
associated modalities.
• Prior work dealing with events often tried to categorize and label these various aspects because they were crucial for determining temporal relations.
• Instead, this work simply has people label all events, irrespective of their modality, and uses natural language to describe the relations between them.
Events
11. TORQUE
Temporal Relations
• The relationship between two events with respect to time, or
between one event and a fixed time point.
• (A, r, B) – A and B are events or time points, and r is a
temporal relation. (e.g. (HAD, happened before, SLEPT) – first
sentence in Fig. 3)
• In previous works, every event is assumed to be associated
with a time interval. When comparing two events, there are
13 possible relation labels.
• There are still many relations that cannot be expressed
because the assumption that every event has a time interval
is inaccurate: The time scope of an event may be fuzzy, an
event can have a non-factual modality, or events can be
repetitive and invoke multiple intervals.
• To better handle these phenomena, they use natural
language to annotate the relationships between events.
12. TORQUE
Natural Language Annotation of Temporal Relations
• (A, r, B): a temporal relation between two events
• (?, r, B) : a temporal relation question
• (?, happened before, SLEPT): natural language
expression “what happened before a lion slept?”
• (A, r, B) holds under the assumption that any deictic expression in A or B is interpreted relative to the time point when the passage was written, and that the passage is true.
13. TORQUE
Advantages of Natural Language Annotation
• DISRUPTION and FLOODING happened at about
the same time, but we do not know for sure which
one is earlier, so we have to choose vague.
• For SNOW and DISRUPTION, we do not know which one ends earlier and have to choose vague.
• The question-answer (QA) pairs can naturally
capture these fuzzy relations.
14. TORQUE
Advantages of Natural Language Annotation
• Natural language questions can conveniently
incorporate different modes of events.
• ▲ the relation between “having a meal”, and
“sleeping”
• If we could only choose one label, we must
choose before for all these relations, although
these relations are actually different.
• a repetitive event may be a series of intervals rather than a single one, and “often before” is very different from “before”.
15. TORQUE
Advantages of Natural Language Annotation
• The format of natural language questions
bypasses the need for explicit annotation of
properties of events or other theories.
• The annotator naturally avoids event pairs that
do not have relations.
• “what happened after the service
industries are hardest hit?”
• “what happened after a passerby reported
the body?”
• “what was expected to happen when the
crisis hit America?”
• “what was supposed to happen after a
passerby called the police?”
• It still remains difficult to have a theory explaining why “hit” can be compared to “expected” and “crisis”, but not to “gains”.
16. TORQUE
Penalize Shortcuts by Contrast Sets
• An important problem in building datasets is to
avoid trivial solutions.
• Contrast questions: which slightly modify the
original questions, but dramatically change the
answers
• For an existing question (?, r, B) (e.g., “what
happened after he ate his breakfast?”)
• Keep using B and change r (e.g., “what
happened before/shortly after/… he ate his
breakfast?”)
• Modify it to ask about the start/end time
(e.g., “what happened after he started
eating his breakfast?” or “what would finish
after he ate his breakfast?”)
• Check that the answers to the new question
are different from the original one to avoid
trivial modifications (e.g., changing “what
happened” to “what occurred”)
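As a rough illustration of the contrast-set idea above, the sketch below swaps the temporal keyword in an existing question and keeps the new question only if its answer set differs from the original. In TORQUE this step is carried out by crowd workers, not code; the keyword list, lookup table, and helper names here are illustrative assumptions.

```python
# Sketch of contrast-question generation: modify the relation keyword in (?, r, B)
# and reject trivial modifications whose answers do not change.
RELATION_SWAPS = {"after": ["before", "shortly after"], "before": ["after", "shortly before"]}

def contrast_questions(question, answer_sets):
    """answer_sets maps each question to its (human-provided) set of answer events."""
    original = answer_sets[question]
    contrasts = []
    for keyword, alternatives in RELATION_SWAPS.items():
        if keyword in question:
            for alt in alternatives:
                candidate = question.replace(keyword, alt, 1)
                # keep only contrast questions whose answer set differs from the original
                if answer_sets.get(candidate, original) != original:
                    contrasts.append(candidate)
    return contrasts

answer_sets = {
    "what happened after he ate his breakfast?": {"left", "drove"},
    "what happened before he ate his breakfast?": {"woke"},
}
print(contrast_questions("what happened after he ate his breakfast?", answer_sets))
```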
17. TORQUE
Data Collection
• Passages that consist of two contiguous
sentences, as this is sufficient to capture the
vast majority of non-trivial temporal relations.
• Create a pool of 26k two-sentence passages
from the TempEval3 workshop (2.8k articles)
• 1. Label all the events
• 2. Repeatedly do the following
• (a) Ask a temporal relation question and
point out all the answers from the list of
events
• (b) Modify the temporal relation to create one or more new questions and answer them.
Quality Control
• Qualification: crowd workers were trained and
tested on 3 capabilities: labeling events,
asking temporal relation questions, and
question answering. Crowd workers were
considered level-1 qualified if they could pass
the test within 3 attempts. (1/3 workers
passed the qualification.)
• Pilot: asked level-1 crowd workers to do a
small amount of the real task. They manually
checked the annotations and gave feedback
to them. Roughly 1 out of 3 pilot submissions
received a level-2 qualification. In the end,
there were 63 level-2 annotators, and 60 of them actually worked on the large-scale task.
• Validation: 20% of articles. 5 different level-2 annotators (including the original annotator) validated the events and answers. They intentionally added noise to the original data for quality control. They did not do additional validation of the questions because there were no bad questions in a random sample of 100.
18. TORQUE
Cost
• 3 passages were presented in each job.
• The crowd worker could decide to use some or all of them.
• For each passage a worker decided to use, they needed to label the events, answer 3 hard-coded warm-up questions, and then ask and answer at least 12 questions (including contrast questions). The final reward is a base pay of $6 plus $0.5 for each extra question (up to $4).
• Incentive
• (1) use fewer passages so that they can
do event labeling and warm-up questions
fewer times.
• (2) modify questions instead of asking
from scratch
• (3) ask extra questions in each job.
• In practice, crowd workers on average used 2
passages in each job.
• Validating the events in each passage and the
answers to a specific question both cost $0.1.
• In total, TORQUE cost $15k for an average of
$0.7/question.
statistics
• 3.2k passage annotations (~50 tokens/passage)
• 24.9k events (7.9 events/passage)
• 21.2k user-provided questions (half of them
were labeled by crowd workers as
modifications of existing ones)
• In a random sample of 200 questions, 94 queried relations that cannot be directly represented by the previous single-interval-based labels.
23. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• When reading a story, a human can construct
a latent timeline about events’ start and end
times.
• The timeline not only contains the placements
of explicitly mentioned events (e.g., ride a
bicycle), but also accounts for implicit events
(e.g. Farrah was distracted so she looked away).
• The ability to construct such a timeline is
essential for understanding the causal
dynamics of a situation.
• Contributions
• A temporal relation dataset TRACIE
focusing on implicit events
• A distant supervision process for temporal
understanding of implicit events
• A reasoning model that makes end-time
comparisons using predictions of start-
time distances and durations
24. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• Such tests in TRACIE take the form of multi-premise textual entailment (TE)
• Each TRACIE instance contains
• A context story (or premise) consisting of a sequence of explicit narrative events
• An implicit event in the form of a natural language phrase that is unmentioned but has
some role in the story
• An explicit event also in the form of a phrase
• A comparator of either {starts, ends}
• A temporal relation of either {before, after} that marks the relationship in the dimension
defined by the comparator between the implicit-event and the explicit-event
25. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• Such tests in TRACIE take the form of multi-premise textual entailment (TE)
• Premise: context story
• Hypotheses: temporal queries about pair-wise relations between implicit and explicit events
• E.g. “avoids” is implicit-event, “starts” is the comparator, “removed” is explicit-event and “before”
is the temporal-relation.
• Flip the temporal-relation (i.e., “before” to “after” and vice versa) to create negative (contradiction) instances.
• Use start times of explicit-events as reference points and compare the implicit-event’s start or
end time with them, according to the label definitions (Fig. 3)
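A minimal sketch of how one such TRACIE-style entailment instance and its flipped (contradiction) counterpart could be represented, following the structure on this slide; the dataclass, field names, and the placeholder premise are assumptions, not the dataset's actual schema.

```python
# Sketch of a TRACIE-style multi-premise textual-entailment instance.
from dataclasses import dataclass, replace

@dataclass
class TracieInstance:
    premise: str           # context story (sequence of explicit narrative events)
    implicit_event: str    # unmentioned but inferable event
    comparator: str        # "starts" or "ends"
    relation: str          # "before" or "after"
    explicit_event: str
    label: str             # "entailment" or "contradiction"

    @property
    def hypothesis(self) -> str:
        return f"{self.implicit_event} {self.comparator} {self.relation} {self.explicit_event}"

pos = TracieInstance(
    premise="(short context story from ROCStories)",
    implicit_event="she avoids hard food",
    comparator="starts", relation="before",
    explicit_event="the braces are removed",
    label="entailment",
)
# flip the temporal relation to create the negative (contradiction) instance
neg = replace(pos, relation="after", label="contradiction")
print(pos.hypothesis, "->", pos.label)
print(neg.hypothesis, "->", neg.label)
```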
26. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
Implicit Event Generation
• Randomly sample short stories from the ROCStories dataset
• For each story, one annotator writes 5 implicit event phrases that are not explicitly mentioned by the given story, but are inferable and relevant.
• The annotator additionally rewrites the two explicit events closest to the implicit event’s start and end time, respectively.
• Build two TRACIE instances (minus the temporal-relation) per implicit event
Automatic Instance Generation
• Extract all verbs and relevant arguments with the semantic role labeling model in AllenNLP
• Construct a pool of explicit events in the form of short phrases (using the verbs and their arguments)
• For each implicit event, randomly select two {explicit-event, comparator} pairs from the pool.
Label Collection
• For each of the 20 instances per story,
annotate the temporal-relation with four
different annotators.
• Majority agreement as the final label and filter
out unagreeable instances.
• Two authors additionally verified the instances with ambiguous verbs (e.g., “have”) and corrected 5% of the end-time instances.
27. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
Pattern-Based Pre-Training
• Distant Supervision
• Within-sentence Extraction
• Collect start-time comparisons between pairs of events heuristically from free text using “before/after” keywords
• Use AllenNLP’s SRL model to process each input sentence and find verbs with a temporal argument that starts with either “before” or “after” and contains at least one other verb.
• If there are multiple verbs in the temporal argument, take the one with the largest number of tokens as arguments.
• 2.8M instances from a Wikipedia dump (May 2020)
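A hedged sketch of the within-sentence extraction heuristic described above, using AllenNLP's SRL predictor to find verbs whose temporal argument starts with "before"/"after" and contains another verb. The model path, the tie-breaking (first inner verb instead of the longest argument), and helper names are assumptions rather than the authors' exact pipeline.

```python
# Sketch of within-sentence "before/after" pair extraction via SRL.
# Requires allennlp and allennlp-models; the model URL below is an assumption.
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path(
    "https://storage.googleapis.com/allennlp-public-models/structured-prediction-srl-bert.2020.12.15.tar.gz"
)

def extract_pairs(sentence: str):
    out = predictor.predict(sentence=sentence)
    words = out["words"]
    all_verbs = {frame["verb"] for frame in out["verbs"]}
    pairs = []
    for frame in out["verbs"]:
        # tokens tagged as this verb's temporal argument (B-/I-ARGM-TMP)
        tmp = [w for w, t in zip(words, frame["tags"]) if t.endswith("ARGM-TMP")]
        if tmp and tmp[0].lower() in ("before", "after"):
            inner = [w for w in tmp if w in all_verbs and w != frame["verb"]]
            if inner:
                # (event A, before/after, event B) start-time comparison
                pairs.append((frame["verb"], tmp[0].lower(), inner[0]))
    return pairs

print(extract_pairs("She wrote a review after she went to the park."))
```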
28. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• Distant Supervision
• Cross-sentence Extraction
• The data collected from the within-sentence patterns
does not reveal the relative distance between two
start times.
• Finds direct temporal expressions of hours and dates.
• Because these temporal expressions(e.g., 2021-01-01)
are globally comparable, the compared events can be
anywhere in a document.
• This process collects more supervision signals about
time-point comparisons and their relative distance
on event pairs with trivial causal relation.
• Find exact temporal values by filling unmentioned elements of a temporal expression with the nearest previous mention (e.g., add “January” to the expression “the 10th” in Fig. 4)
Pattern-Based Pre-Training (PTNTIME)
29. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• Cross-sentence Extraction
• Construct supervision instances under the assumption
that the extracted temporal expressions describe the
start times of the associated verbs (e.g., went started
on January 1st )
• Represent the differences between the two start times
as one of seven coarse temporal units: {<=minutes,
hours, days, weeks, months, years, >= decades}
• “Go to park” is weeks before “write review”, as shown in Fig. 4
• Couple the specialized temporal pre-training data
described above with additional paragraphs that are used
to perform conventional language model pretraining using
the original denoising task (T5).
• Input sequence: “event: [EventA] starts [Relation] [EventB] . Story: [Paragraph]”; output sequence: “answer: [Label] [Distance]”. [Paragraph] is non-empty only for cross-sentence extractions; [Label] is either positive or negative; [Distance] is one of the 7 coarse temporal units, represented with a set of blank tokens [extra_id_N].
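A small sketch of how such a pre-training instance could be serialized into the input/output format above; the field layout follows the slide, but the helper name, exact spacing, and the mapping from coarse units to [extra_id_N] sentinel tokens are assumptions.

```python
# Sketch: serialize a PTNTIME pre-training instance for a T5-style seq2seq model.
COARSE_UNITS = ["<=minutes", "hours", "days", "weeks", "months", "years", ">=decades"]

def make_instance(event_a, relation, event_b, label, distance_unit=None, paragraph=""):
    # [Paragraph] is non-empty only for cross-sentence extractions
    source = f"event: {event_a} starts {relation} {event_b} . story: {paragraph}".strip()
    # [Distance] is one of the 7 coarse units, here mapped to a T5 sentinel token (assumption)
    distance = f"<extra_id_{COARSE_UNITS.index(distance_unit)}>" if distance_unit else ""
    target = f"answer: {label} {distance}".strip()
    return source, target

src, tgt = make_instance(
    "went to the park", "before", "wrote a review",
    label="positive", distance_unit="weeks",
    paragraph="She went to the park on the 1st. She wrote a review on the 10th.",
)
print(src)
print(tgt)
```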
30. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• This model makes end-time comparisons by symbolically combining start time distance and
duration from separate predictions based on some of the components.
• It does not rely on explicit annotations of time points, but only on relative comparisons between them.
Symbolic Temporal Reasoning Model (SYSTIME)
31. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
Symbolic Temporal Reasoning Model (SYSTIME)
32. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
Duration estimation – pretrain sequence-to-sequence model
r_ends(e1, e2) = before ⇔ dist(e1, e2) + dur(e1) < 0
r_ends(e1, e2) = after ⇔ dist(e1, e2) + dur(e1) > 0
r_starts(e1, e2) = before ⇔ dist(e1, e2) < 0
r_starts(e1, e2) = after ⇔ dist(e1, e2) > 0
• Use duration data from TimeM (1M events and
duration values)
• Input sequence event: [Event] story: [Story]
• Output sequence answer: [Value]
• [Event] represents the tokens of an event with
the trigger verb marked by a special token to its
left
• [Story] represents tokens from the context
• [Value] is one of the 7 unit labels (i.e., {<= minutes, hours, …})
33. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
Approximate dist() function using output from PTNTIME
• Input sequences of event : [EventA] starts
[Relation] [EventB] . Story: [Paragraph] and
output sequences of answer: [Label] [Distance] .
• [EventA]: the textual description of e1
• [EventB]: the textual description of e2
• [Paragraph]: the context (premise)
• Fix [Relation] to be before.
• By taking the values of the vocabulary indices
corresponding to “positive” and “negative” from
the logits of [Label] and applying a softmax
operation, get P_before, P_after. P = [P_before,
P_after]
• Apply softmax to the logits of [Distance] over the 7 words representing the temporal units to obtain 7 values that approximate the probability of each distance. Place the 7 values in the temporal units’ increasing order in vector d. c = [0, 1, 2, 3, 4, 5, 6]
• To get the direction, apply the tanh function to the difference between the probabilities in P.
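A hedged numpy sketch of the dist() approximation and the end-time rule from slide 32: softmax over the [Label] logits gives P_before and P_after, softmax over the [Distance] logits gives a distribution over the 7 coarse units, and the sign comes from a tanh of the probability difference. The sign convention (negative distance meaning e1 starts before e2) and the variable names are assumptions chosen to match the formulas above.

```python
# Sketch of SYSTIME-style distance approximation and end-time comparison.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def approx_dist(label_logits, distance_logits):
    """Approximate the signed start-time distance dist(e1, e2) from PTNTIME outputs."""
    p_before, p_after = softmax(np.asarray(label_logits, dtype=float))   # P = [P_before, P_after]
    d = softmax(np.asarray(distance_logits, dtype=float))                # probs over the 7 coarse units
    c = np.arange(7)                                                     # c = [0, 1, 2, 3, 4, 5, 6]
    magnitude = float(d @ c)                                             # expected coarse distance
    # negative when "e1 starts before e2" is more likely (r_starts = before <=> dist < 0)
    direction = float(np.tanh(p_after - p_before))
    return direction * magnitude

def ends_before(dist_e1_e2, dur_e1):
    # r_ends(e1, e2) = before  <=>  dist(e1, e2) + dur(e1) < 0  (both on the same coarse scale)
    return dist_e1_e2 + dur_e1 < 0

print(approx_dist([2.0, -1.0], [0.1, 0.3, 2.0, 0.2, 0.1, 0.0, -0.5]))
```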
34. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• T5-Large for PTNTIME and the duration model.
• PTNTIME – 45k steps (1.4M instances); duration model – 80k steps (2.6M instances)
• These pretrained weights are used in SYSTIME; SYSTIME ZEROSHOT uses no TRACIE supervision.
• Story-wide exact match metric: the percentage of stories with all of their related hypotheses answered correctly.
35. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• Uniform-dist: in the i.i.d. training set, 70% of the examples with the comparator “ends” and relation “after” are positive, so they randomly remove instances from the majority classes.
36. TRACIE: Temporal Reasoning on Implicit Events from Distant Supervision
• Train and evaluate only the instances with a label
of either “before” or “after”, which accounts for
about 80% of all instances.
• OT-NS (original test, no story): train and test with only the sentences containing the trigger verbs
• OT: train and test with the entire document as an auxiliary input
• OT-MS (original test, minimal supervision): train with 1.2k (6%) training instances
• PT (perturbed test): train with the complete training set and test on a perturbed test set from Evaluating Models’ Local Decision Boundaries via Contrast Sets.