The document discusses topics in natural language processing and knowledge representation, including conceptual dependency theory, script structures, the CYC knowledge base, case grammars, and the Semantic Web. It presents each topic through a series of slides by Madhav Mishra, describing the components of scripts, the features of CYC and example queries against its knowledge base, how the Semantic Web uses XML, RDF and ontologies, and an overview of case grammars and their use of functional relationships between nouns and verbs.
24. Script Structures
• A script is a structured representation describing a stereotyped sequence of events in a particular context.
• Scripts are used in natural language understanding systems to organize a knowledge base in terms of the situations that the system should understand. Scripts use a frame-like structure to represent commonly occurring experiences such as going to the movies, eating in a restaurant, shopping in a supermarket, or visiting an ophthalmologist.
• Thus, a script is a structure that prescribes a set of circumstances that can be expected to follow on from one another.
25. • Scripts are beneficial because:
• Events tend to occur in known runs or patterns.
• A causal relationship between events exists.
• An entry condition exists which allows an event to take place.
• Prerequisites exist for events taking place.
26. Components of a script:
• The components of a script include:
• Entry conditions: Basic conditions which must be fulfilled before the events in the script can occur.
• Results: Conditions that will be true after the events in the script have occurred.
• Props: Slots representing objects involved in the events.
• Roles: The actions that the individual participants perform.
• Track: A variation on the script. Different tracks may share components of the same script.
• Scenes: The sequence of events that occur.
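As an illustration that is not part of the original slides, a script with these components can be sketched as a simple data structure. The Python rendering below is hypothetical; the field names mirror the components listed above, and the content anticipates the "going to the bank to withdraw money" example on slide 28.

# Hypothetical sketch of a script as a Python data structure; all names and values are illustrative.
withdraw_money_script = {
    "track": "withdraw cash at the counter",  # variation of the generic bank script
    "entry_conditions": ["customer has an account", "customer is at the bank"],
    "props": ["withdrawal slip", "cash", "counter"],
    "roles": ["customer", "teller"],
    "scenes": [  # the stereotyped sequence of events
        "enter bank",
        "fill in withdrawal slip",
        "hand slip to teller",
        "teller verifies account and hands over cash",
        "leave bank",
    ],
    "results": ["customer has cash", "account balance is reduced"],
}

def entry_conditions_met(script, known_facts):
    # Events in the script can only take place once every entry condition holds.
    return all(condition in known_facts for condition in script["entry_conditions"])

print(entry_conditions_met(withdraw_money_script,
                           {"customer has an account", "customer is at the bank"}))  # prints True

Such a structure lets a system predict unstated events: once the entry conditions hold and the first scenes are observed, the remaining scenes and results can be inferred.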
28. Example: Script for going to the bank to withdraw money.
29. Advantages of Scripts
• Ability to predict events.
• A single coherent interpretation may be built up from a collection of observations.
Disadvantages of Scripts
• Less general than frames.
• May not be suitable to represent all kinds of knowledge.
31. • Cyc has a huge knowledge base which it uses for reasoning. It contains:
• 15,000 predicates
• 300,000 concepts
• 3,200,000 assertions
• All these predicates, concepts and assertions are arranged in numerous ontologies.
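For illustration only (not from the original slides): Cyc's assertions are written in its representation language, CycL, in which constant names carry a #$ prefix. Assertions of roughly the following form make up the knowledge base, the first stating that an individual is an instance of a collection and the second that one collection generalizes to another.

(#$isa #$BillClinton #$UnitedStatesPresident)
(#$genls #$UnitedStatesPresident #$Person)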
32. Cyc: Features
Uncertain Results
• Query: "Who had the motive for the assassination of Rafik Hariri?"
• Since the case is still an unsolved political mystery, there is no way we can ever get a definitive answer.
• In cases like these, Cyc returns the various viewpoints, quoting the sources from which it built its inferences.
• For the above query, it gives two viewpoints:
• "USA and Israel", as quoted from an editorial in Al Jazeera
• "Syria", as quoted from a news report from CNN
33. • It uses Google as the search engine in the background.
• It filters results according to the context of the query.
• For example, if we search for the assassination of Rafik Hariri, it omits results whose time stamp is earlier than the date of the assassination.
34. Qualitative Queries
• Query: "Was Bill Clinton a good President of the United States?"
• In cases like these, Cyc returns the results as pros and cons and leaves it to the user to draw a conclusion.
Queries With No Answer
• Query: "At this instant of time, is Alice inhaling or exhaling?"
• The Cyc system is intelligent enough to recognize queries that can never be answered correctly.
35. • The ultimate goal is to build enough common sense into the Cyc system that it can understand Natural Language.
• Once it understands Natural Language, all the system has to do is crawl through online material, learn new common-sense rules, and evolve.
• This two-step process of building common sense and using machine learning techniques to learn new things will make the Cyc system an infinite source of knowledge.
36. Drawbacks
• There is no single Ontology that works in all
cases.
• Although Cyc is able to simulate common sense,
it cannot distinguish between fact and fiction.
• In Natural Language Processing there is no way
the Cyc system can figure out if a particular word
is used in the normal sense or in the sarcastic
sense.
• Adding knowledge is a very tedious process.
37. Semantic Web
• The development of the Semantic Web is well underway, with the goal that it would be
possible for machines to understand the information on the web rather than
simply display it.
• The major obstacle to this goal is the fact that most information on the web is
designed solely for human consumption. This information should be structured
in a way that machines can understand and process that information.
• The concept of machine-understandable documents does not imply “Artificial
Intelligence”. It only indicates a machine’s ability to solve well-defined problems
by performing well-defined operations on well-defined data.
• The key technological threads that are currently employed in the development
of Semantic Web are: eXtensible Markup Language (XML), Resource Description
Framework (RDF), DAML (DARPA Agent Markup Language).
• Most of the web’s content today is designed for humans to read, not for
computer programs to process meaningfully.
• Computers can
- parse the web pages.
- perform routine processing (here a header, there a link, etc.)
• In general, they have no reliable way to understand and process the semantics.
• The Semantic Web will bring structure to the meaningful content of
web pages, creating an environment where software agents roaming from page to
page can carry out sophisticated tasks for users.
• The Semantic Web is not a separate web but an extension of the current one, in
which information is given well-defined meaning.
39. Knowledge Representation
• For the Semantic Web to function, the computers should have access to:
• Structured Collections of Information
• Meaning of this Information
• Sets of Inference Rules/Logic.
These sets of inference rules can be used to conduct automated
reasoning.
• Technological Threads for developing the Semantic Web:
- XML
- RDF
- Ontologies
40. XML
• XML lets everyone create their own tags.
• These tags can be used by the script programs in sophisticated ways to
perform various tasks, but the script writer has to know what the page
writer uses each tag for.
• In short, XML allows you to add arbitrary structure to the documents but
says nothing about what the structures mean.
• It has no built-in mechanism to convey the meaning of the user’s new tags to
other users.
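A brief sketch of the point above, using Python's standard xml.etree module on a fragment with invented tags: the program recovers the structure, but nothing in the document says what the tags mean.

import xml.etree.ElementTree as ET

# Invented tags: XML gives us the structure, but not the meaning.
doc = """
<invoice>
  <customer>Alice</customer>
  <due>2024-01-31</due>
</invoice>
"""

root = ET.fromstring(doc)
for child in root:
    # We can list tags and their text, but whether <due> is a date,
    # a deadline or something else is not expressed anywhere in the XML.
    print(child.tag, "->", child.text)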
41. RDF
• A scheme for defining information on the web. It provides the technology for
expressing the meaning of terms and concepts in a form that computers can
readily process.
• RDF encodes this information on the XML page in sets of triples. Each
triple is a piece of information on the web about related things.
• Each triple is a combination of Subject, Verb and Object, similar to an
elementary sentence.
• Subjects, Verbs and Objects are each identified by a URI, which enables
anyone to define a new concept or new verb just by defining a URI for it
somewhere on the web.
42. RDF (contd.)
• These triples can be written using XML tags, as shown in the sketch below.
• An RDF document can make assertions that particular things (people, web
pages or whatever) have properties (“is a sister of”, “is the author of”) with
values (another person, another web page, etc.).
• RDF uses a different URI for each specific concept. This solves the problem of
the same term being used for different concepts, e.g. address tags in different XML pages.
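A minimal sketch of such triples using the rdflib library (assuming it is installed); the URIs and property names are invented for the example, and the graph is serialised to RDF/XML, i.e. the triples written with XML tags.

from rdflib import Graph, Namespace, URIRef

# Invented namespace and resources, used only to illustrate subject-verb-object triples.
EX = Namespace("http://example.org/terms#")
alice = URIRef("http://example.org/people#alice")
bob = URIRef("http://example.org/people#bob")
report = URIRef("http://example.org/docs#report")

g = Graph()
g.add((alice, EX.isSisterOf, bob))      # subject, verb (property), object
g.add((alice, EX.isAuthorOf, report))

# Serialise the triples as RDF/XML, i.e. the triples "written using XML tags".
print(g.serialize(format="xml"))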
43. Ontologies
• Ontologies are collections of statements, written in a language such as RDF, that define relations
between concepts and specify logical rules for reasoning about them.
• Computers/agents/services will understand the meaning of semantic data on
a web page by following links to specified ontologies.
• Ontologies can express a large number of relationships among entities
(objects) by assigning properties to classes and allowing subclasses to inherit
such properties.
• An ontology may express a rule such as:
if a city code is associated with a state code, and an address uses that
city code, then that address has the associated state code (a sketch of this rule appears below).
• Ontologies enhance the functioning of the Semantic Web: they improve the accuracy of web
searches and ease the development of programs that can tackle complicated queries.
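A minimal sketch of the city-code/state-code rule above, assuming a toy ontology held in plain Python dictionaries; the codes and addresses are made up.

# Toy "ontology": city codes mapped to state codes (made-up values).
city_to_state = {"MUM": "MH", "PUN": "MH", "BLR": "KA"}

# Addresses that carry only a city code.
addresses = [
    {"street": "1 Marine Drive", "city_code": "MUM"},
    {"street": "5 MG Road", "city_code": "BLR"},
]

# Rule: if a city code is associated with a state code, and an address uses
# that city code, then the address also has the associated state code.
for addr in addresses:
    addr["state_code"] = city_to_state[addr["city_code"]]

print(addresses)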
47. Case Grammars
• Case grammars use the functional relationships between noun phrases and verbs
to uncover the deeper case structure of a sentence.
• Generally, in English, the difference between different surface forms of a
sentence is quite negligible.
• In the late 1960s and early 1970s, Fillmore put forward the idea of different cases of an English
sentence.
• He extended the transformational grammars of Chomsky by focusing more on the
semantic aspects of a sentence.
• In case grammars a sentence is defined as being composed of a proposition P and a
modality constituent M, composed of mood, tense, aspect, negation and so on.
Thus we can represent a sentence as
S → M + P
where P is the set of relationships among verbs and noun phrases, i.e.
P → V + C1 + C2 + … + Cn (C = case), and M is the modality constituent.
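As an informal illustration of a case structure, the sketch below represents the sentence “The boy opened the door with a key” with a modality constituent M and a proposition P holding a few of Fillmore's cases; the labels and dictionary layout are an assumption for the example, not a full case-grammar implementation.

# Modality constituent M and proposition P for one example sentence.
# The case labels follow Fillmore's terminology; the structure is illustrative.
sentence = "The boy opened the door with a key"

case_frame = {
    "M": {"tense": "past", "mood": "declarative", "negation": False},
    "P": {
        "verb": "open",
        "cases": {
            "agentive": "the boy",       # the instigator of the action
            "objective": "the door",     # the entity affected by the action
            "instrumental": "a key",     # the object used to perform the action
        },
    },
}

print(case_frame["P"]["cases"]["agentive"])   # 'the boy'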
51. Components of NLP
• There are two components of NLP as given −
Natural Language Understanding (NLU)
• Understanding involves the following tasks −
• Mapping the given input in natural language into useful representations.
• Analysing different aspects of the language.
Natural Language Generation (NLG)
• It is the process of producing meaningful phrases and sentences in the form of
natural language from some internal representation. It involves:
• Text planning − It includes retrieving the relevant content from the knowledge base.
• Sentence planning − It includes choosing required words, forming meaningful
phrases, setting tone of the sentence.
• Text Realization − It is mapping sentence plan into sentence structure.
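A minimal sketch of the three NLG stages above, using a made-up knowledge base and simple template-style realization; real NLG systems are considerably more sophisticated.

kb = {"patient": "Alice", "temperature_c": 39.2}   # made-up knowledge base

def text_planning(kb):
    """Retrieve the relevant content from the knowledge base."""
    return {"who": kb["patient"], "value": kb["temperature_c"]}

def sentence_planning(content):
    """Choose the required words and form a meaningful phrase plan."""
    return [content["who"], "has", f"a fever of {content['value']} degrees"]

def text_realization(plan):
    """Map the sentence plan onto a surface sentence."""
    return " ".join(plan) + "."

print(text_realization(sentence_planning(text_planning(kb))))
# -> 'Alice has a fever of 39.2 degrees.'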
52. NLP Terminology
• Phonology − It is the study of organizing sounds systematically.
• Morphology − It is the study of the construction of words from primitive
meaningful units.
• Syntax − It refers to arranging words to make a sentence. It also involves
determining the structural role of words in the sentence and in phrases.
• Semantics − It is concerned with the meaning of words and how to
combine words into meaningful phrases and sentences.
• Pragmatics − It deals with using and understanding sentences in different
situations and how the interpretation of the sentence is affected.
• Discourse − It deals with how the immediately preceding sentence can
affect the interpretation of the next sentence.
• World Knowledge − It includes the general knowledge about the world.
53. Sentence Analysis Phases
• Lexical Analysis − It involves identifying and analyzing the
structure of words. Lexicon of a language means the collection
of words and phrases in a language. Lexical analysis is dividing
the whole chunk of text into paragraphs, sentences, and words.
• Syntactic Analysis (Parsing) − It involves analysis of words in
the sentence for grammar and arranging words in a manner
that shows the relationship among the words. A sentence
such as “The school goes to boy” is rejected by the English syntactic
analyzer.
• Semantic Analysis − It draws the exact meaning or the
dictionary meaning from the text. The text is checked for
meaningfulness. It is done by mapping syntactic structures onto
objects in the task domain. The semantic analyzer disregards
sentences such as “hot ice-cream”.
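A rough sketch of the lexical-analysis step only, assuming blank lines separate paragraphs and a full stop ends a sentence (real tokenizers handle far more cases):

text = """The bird pecks the grains. The farmer watches.

It is a sunny day."""

# Naive lexical analysis: paragraphs -> sentences -> words.
paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
for p_no, paragraph in enumerate(paragraphs, start=1):
    sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
    for s_no, sentence in enumerate(sentences, start=1):
        print(f"paragraph {p_no}, sentence {s_no}:", sentence.split())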
54. • Discourse Integration − The meaning of any sentence depends upon
the meaning of the sentence just before it. In addition, it also contributes
to the meaning of the immediately succeeding sentence.
• Pragmatic Analysis − During this phase, what was said is re-interpreted in terms of
what was actually meant. It involves deriving those aspects of language
which require real world knowledge.
55. Grammars And Parsers
• Context-Free Grammar
• It is a grammar that consists of rules with a single symbol on the left-
hand side of each rewrite rule. Let us create a grammar to parse a
sentence −
“The bird pecks the grains”
Articles (DETERMINER(DET)) − a | an | the
Nouns − bird | birds | grain | grains
Noun Phrase (NP) − Article + Noun | Article + Adjective + Noun
= DET N | DET ADJ N
Verbs − pecks | pecking | pecked
Verb Phrase (VP) − NP V | V NP
Adjectives (ADJ) − beautiful | small | chirping
56. • The parse tree breaks down the sentence into structured parts so that
the computer can easily understand and process it. In order for the
parsing algorithm to construct this parse tree, a set of rewrite rules,
which describe what tree structures are legal, need to be constructed.
• These rules say that a certain symbol may be expanded in the tree by a
sequence of other symbols. For example, the first rule says that if there
are two constituents, a Noun Phrase (NP) and a Verb Phrase (VP), then the string
formed by NP followed by VP is a sentence.
sentence are as follows −
• S → NP VP
• NP → DET N | DET ADJ N
• VP → V NP
57. • Lexicon −
• DET → a | the
• ADJ → beautiful | perching
• N → bird | birds | grain | grains
• V → peck | pecks | pecking
• The parse tree can be created as
shown −
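A small sketch of the grammar and lexicon above written in NLTK's CFG notation (assuming the nltk package is installed); the chart parser prints the parse tree for “the bird pecks the grains”, with the sentence lower-cased so the words match the terminals.

import nltk

# Grammar and lexicon from the slides, in NLTK's CFG notation.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> DET N | DET ADJ N
VP -> V NP
DET -> 'a' | 'the'
ADJ -> 'beautiful' | 'perching'
N -> 'bird' | 'birds' | 'grain' | 'grains'
V -> 'peck' | 'pecks' | 'pecking'
""")

parser = nltk.ChartParser(grammar)
sentence = "the bird pecks the grains".split()

for tree in parser.parse(sentence):
    print(tree)   # (S (NP (DET the) (N bird)) (VP (V pecks) (NP (DET the) (N grains))))
    # tree.draw() # uncomment to display the tree graphically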
58. PARSING PROCESS
• Parsing is the term used to describe the process
of automatically building syntactic analysis of a
sentence in terms of a given grammar and
lexicon.
• The resulting syntactic analysis may be used as
input to a process of semantic interpretation.
• Occasionally, parsing is also used to include both
syntactic and semantic analysis.
• The parsing process is done by the parser.
• Parsing performs grouping and labeling of the
parts of a sentence in a way that displays their
relationships to each other.
• The parser is a computer program which accepts
the natural language sentence as input and
generates an output structure suitable for
analysis.
59. Types of Parsing
• The parsing technique can be categorized into two types such as
- Top down Parsing
- Bottom up Parsing
Top down Parsing
Top down parsing starts with the starting symbol and proceeds towards the goal. We can say
it is the process of constructing the parse tree starting at the root and proceeding towards the
leaves.
It is a strategy of analyzing unknown data relationships by hypothesizing general parse tree
structures and then considering whether the known fundamental structures are compatible
with the hypothesis.
In top down parsing words of the sentence are replaced by their categories like verb phrase
(VP), Noun phrase (NP), Preposition phrase (PP), etc.
Let us consider some examples to illustrate top down parsing. We will consider both the
symbolic representation and the graphical representation. We will take the words of the
sentence and arrive at the complete sentence. For parsing we will consider the previously
defined symbols like PP, NP, VP, ART, N, V and so on. Examples of top down parsing are LL (Left-to-
right, leftmost derivation) parsing, recursive descent parsing, etc.
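A brief sketch of top down parsing with NLTK's recursive descent parser on a cut-down version of the earlier toy grammar (assuming nltk is installed); it starts from S and expands categories downward until the hypothesised leaves match the input words.

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> DET N
VP -> V NP
DET -> 'the'
N -> 'bird' | 'grains'
V -> 'pecks'
""")

# Top-down: start from S, expand NP VP, and keep expanding categories until
# the hypothesised leaves match the words of the input sentence.
rd_parser = nltk.RecursiveDescentParser(grammar)
for tree in rd_parser.parse("the bird pecks the grains".split()):
    print(tree)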
61. Bottom up Parsing
• In this parsing technique the process begins with the sentence, and
the words of the sentence are replaced by their relevant symbols.
• It is also called shift-reduce parsing.
• In bottom up parsing the construction of parse tree starts at the
leaves and proceeds towards the root.
• Bottom up parsing is a strategy for analyzing unknown data
relationships that attempts to identify the most fundamental units
first and then to infer higher order structures for them.
• This process occurs in the analysis of both natural languages and
computer languages.
• It is common for bottom up parsers to take the form of general
parsing engines that can either parse or generate a parser for a
specific programming language, given a specification of its grammar.
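For comparison, a brief sketch of bottom up (shift-reduce) parsing on the same cut-down toy grammar with NLTK's ShiftReduceParser (assuming nltk is installed); it shifts words onto a stack and reduces them to NP, VP and finally S. Note that a plain shift-reduce parser is not guaranteed to find a parse for every grammar.

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> DET N
VP -> V NP
DET -> 'the'
N -> 'bird' | 'grains'
V -> 'pecks'
""")

# Bottom-up: shift words, reduce 'the bird' to NP, 'pecks the grains' to VP,
# and finally reduce NP VP to S.
sr_parser = nltk.ShiftReduceParser(grammar)
for tree in sr_parser.parse("the bird pecks the grains".split()):
    print(tree)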
63. Semantic Analysis
• Semantic Analysis is the process of drawing meaning from text.
• It allows computers to understand and interpret sentences, paragraphs, or
whole documents, by analysing their grammatical structure, and identifying
relationships between individual words in a particular context.
• It’s an essential sub-task of Natural Language Processing (NLP) and the
driving force behind machine learning tools like chatbots, search engines,
and text analysis.
• Semantic analysis-driven tools can help companies automatically extract
meaningful information from unstructured data, such as emails, support
tickets, and customer feedback.
64. How Semantic Analysis Works
• Lexical semantics plays an important role in semantic analysis, allowing
machines to understand relationships between lexical items (words, phrasal
verbs, etc.):
• Hyponyms: specific lexical items of a generic lexical item (hypernym) e.g.
orange is a hyponym of fruit (hypernym).
• Meronymy: a logical arrangement of text and words that denotes a
constituent part of, or member of, something, e.g., a segment of an orange
• Polysemy: a relationship in which the meanings of a word or phrase, although
slightly different, share a common core meaning, e.g., “I read a paper”
and “I wrote a paper”
• Synonyms: words that have the same sense or nearly the same meaning as
another, e.g., happy, content, ecstatic, overjoyed
• Antonyms: words that have close to opposite meanings e.g., happy, sad
• Homonyms: two words that sound the same and are spelled alike but
have different meanings, e.g., orange (color), orange (fruit)
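A short sketch of these lexical relations using WordNet through NLTK (assuming nltk and its wordnet corpus are installed, e.g. via nltk.download('wordnet')); the words mirror the examples above.

from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

# Homonymy / polysemy: one spelling, several senses (colour, fruit, ...).
for synset in wn.synsets('orange'):
    print(synset.name(), '-', synset.definition())

# Hypernyms of the fruit sense of 'orange' and a few hyponyms of 'fruit'.
print('hypernyms:', wn.synset('orange.n.01').hypernyms())
print('hyponyms :', wn.synset('fruit.n.01').hyponyms()[:5])

# Part-whole (meronymy) relations for 'tree'.
print('meronyms :', wn.synset('tree.n.01').part_meronyms())

# Synonyms and antonyms of 'happy'.
happy = wn.synset('happy.a.01')
print('synonyms :', [lemma.name() for lemma in happy.lemmas()])
print('antonyms :', [ant.name() for lemma in happy.lemmas() for ant in lemma.antonyms()])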
65. • Semantic analysis also takes into account signs and symbols (semiotics)
and collocations (words that often go together).
• Automated semantic analysis works with the help of machine learning
algorithms.
• By feeding semantically enhanced machine learning algorithms with
samples of text, you can train machines to make accurate predictions
based on past observations.
• There are various sub-tasks involved in a semantic-based approach for
machine learning, including word sense disambiguation and relationship
extraction:
Word Sense Disambiguation & Relationship Extraction
66. Word Sense Disambiguation:
• The automated process of identifying in which sense a word is used,
according to its context.
• Natural language is ambiguous and polysemic; sometimes, the same
word can have different meanings depending on how it’s used.
• The word “orange,” for example, can refer to a color, a fruit, or even a
city in Florida!
• The same happens with the word “date,” which can mean either a
particular day of the month, a fruit, or a meeting.
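A quick sketch of word sense disambiguation with NLTK's implementation of the Lesk algorithm (assuming nltk and the wordnet corpus are installed); it picks the WordNet sense of “date” whose dictionary gloss best overlaps the context. Lesk is a simple gloss-overlap heuristic, so the chosen sense can be unintuitive for short contexts.

from nltk.wsd import lesk               # simplified Lesk algorithm shipped with NLTK
                                        # requires: nltk.download('wordnet')

context = "I bought sweet dates and figs at the market".split()
sense = lesk(context, 'date', pos='n')  # choose a noun sense of 'date' for this context
print(sense, '-', sense.definition() if sense else 'no sense found')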
67. • Relationship Extraction
• This task consists of detecting the semantic relationships present in a
text. Relationships usually involve two or more entities (which can be
names of people, places, company names, etc.). These entities are
connected through a semantic category, such as “works at,” “lives in,”
“is the CEO of,” “headquartered at.”
• For example, the phrase “Steve Jobs is one of the founders of Apple,
which is headquartered in California” contains two different
relationships: (Steve Jobs, founder of, Apple) and (Apple, headquartered in, California).
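A minimal, pattern-based sketch of relationship extraction for the example sentence above; real systems use trained models rather than hand-written regular expressions, so this is illustrative only.

import re

text = ("Steve Jobs is one of the founders of Apple, "
        "which is headquartered in California")

relations = []

# Pattern 1: "<Person> is one of the founders of <Org>"
m = re.search(r"(.+?) is one of the founders of (\w+)", text)
if m:
    relations.append((m.group(1), "founder of", m.group(2)))

# Pattern 2: "<Org>, which is headquartered in <Place>"
m = re.search(r"(\w+), which is headquartered in (\w+)", text)
if m:
    relations.append((m.group(1), "headquartered in", m.group(2)))

print(relations)
# [('Steve Jobs', 'founder of', 'Apple'), ('Apple', 'headquartered in', 'California')]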
77. Dictionary
• Also known as the UNL Dictionary.
• It stores concepts, represented by the words of a language.
• It stores universal words for identifying concepts, word headings that can express concepts, and
information on their syntactic behaviour.
• Each entry consists of a correspondence between a concept and a word along with information
concerning syntactic properties.
• The Grammar for defining words of the language in the dictionary is shown below
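Purely as an illustration of the kind of information such a dictionary entry carries (concept, head word, syntactic attributes), a sketch with invented field names and notation is given below; it is not the actual UNL dictionary-definition grammar.

# Invented structure, for illustration only: one entry linking a concept
# (a "universal word") to a language word plus syntactic information.
entry = {
    "headword": "book",                       # word heading in the natural language
    "universal_word": "book(icl>document)",   # concept identifier (illustrative notation)
    "syntax": {"pos": "noun", "number": "singular"},
}

print(entry["headword"], "expresses the concept", entry["universal_word"])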