This lecture introduces a graduate-level course on machine learning for natural language processing (NLP). It covers the course overview, what NLP is, applications of NLP, and challenges in the field. Key NLP tasks like part-of-speech tagging, syntactic and dependency parsing, word sense disambiguation, and coreference resolution are discussed. Machine learning algorithms commonly used for NLP, such as classification, regression, and neural networks, are also introduced. The lecture outlines the course content, the assignments (paper summaries, presentations, and a final project), and expectations for students.
NLP Town's Yves Peirsman talks about how word embeddings, LSTMs/RNNs, attention, encoder-decoder architectures, and more have moved NLP forward, including which challenges remain to be tackled (and some techniques to do just that).
Thomas Wolf, "An Introduction to Transfer Learning and Hugging Face" (Fwdays)
In this talk I'll start by introducing the recent breakthroughs in NLP that resulted from the combination of Transfer Learning schemes and Transformer architectures. The second part of the talk will be dedicated to an introduction of the open-source tools released by Hugging Face, in particular our transformers, tokenizers, and NLP libraries as well as our distilled and pruned models.
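As a taste of what those libraries look like in practice, here is a minimal sketch using the transformers pipeline API with a distilled model; the model name and example sentence are illustrative assumptions, not taken from the talk.

```python
# Sentiment analysis with a distilled model via the transformers pipeline;
# the model name and example sentence are illustrative assumptions.
from transformers import pipeline

classifier = pipeline("sentiment-analysis",
                      model="distilbert-base-uncased-finetuned-sst-2-english")
print(classifier("Transfer learning has pushed NLP forward."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99}]
```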
The document provides an overview of natural language processing (NLP) and its related areas. It discusses the classical view of NLP involving stages of processing like syntax, semantics, pragmatics, etc. It also discusses the statistical/machine learning view of NLP, where NLP tasks are framed as classification problems and cues from language help reduce uncertainty. Finally, it provides examples of lower-level NLP tasks like part-of-speech tagging that can be viewed as sequence labeling problems.
Beyond the Symbols: A 30-minute Overview of NLP (MENGSAYLOEM1)
This presentation delves into the world of Natural Language Processing (NLP), exploring its goal to make human language understandable to machines. The complexities of language, such as ambiguity and complex structures, are highlighted as major challenges. The talk underscores the evolution of NLP through deep learning methodologies, leading to a new era defined by large-scale language models. However, obstacles like low-resource languages and ethical issues including bias and hallucination are acknowledged as enduring challenges in the field. Overall, the presentation provides a condensed, yet comprehensive view of NLP's accomplishments and ongoing hurdles.
Filip Panjevic is a Co-Founder and CTO at ydrive.ai, a startup working on self-driving cars, and one of the founders of Petnica Machine Learning School.
Filip's talk will focus on the story of Petnica School: how it started, what has changed since the beginning, what the school's concept looks like now, and why that concept is good at producing new data scientists. This talk will be perfect for people considering a career in data science!
This document provides advice on how to prepare for the GATE CSE exam, including recommended books and study materials for each subject. It emphasizes the importance of practicing problems at the end of chapters in reference books to develop technical knowledge and the ability to apply concepts. Specific books and topics are recommended for subjects like Discrete Mathematics, Algorithms, Data Structures, Theoretical Computer Science, and Database Management Systems. It also stresses the value of video lecture courses to supplement reading. The overall message is to focus on understanding fundamental concepts and gaining problem-solving skills through extensive practice.
[KDD 2018 tutorial] End-to-end goal-oriented question answering systems (Qi He)
End-to-end goal-oriented question answering systems
Version 2.0: an updated version, with references, of the old version (https://www.slideshare.net/QiHe2/kdd-2018-tutorial-end-toend-goaloriented-question-answering-systems).
08/22/2018: the old version has been deleted to reduce confusion.
AIS Technical Development Workshop 2: Text Analytics with Python (Nhi Nguyen)
For the second TD workshop, I introduced NLP (Natural Language Processing) and text analytics to our AIS community. Since October is Halloween month, we decided to choose some short Halloween stories and use Python to analyze the sentiment of these stories. We also included interactive coding exercises throughout the workshop to help students get used to Python and its libraries!
Learning with limited labelled data in NLP: multi-task learning and beyond (Isabelle Augenstein)
When labelled training data for certain NLP tasks or languages is not readily available, different approaches exist to leverage other resources for the training of machine learning models. Those are commonly either instances from a related task or unlabelled data.
An approach that has been found to work particularly well when only limited training data is available is multi-task learning.
There, a model learns from examples of multiple related tasks at the same time by sharing hidden layers between tasks; it can therefore benefit from a larger overall number of training instances and improve its generalisation performance. In the related paradigm of semi-supervised learning, unlabelled data as well as labelled data for related tasks can be utilised by transferring labels from labelled instances to unlabelled ones, essentially extending the training dataset.
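To make the shared-hidden-layers idea concrete, here is a minimal hard-parameter-sharing sketch in PyTorch; the layer sizes, task names, and random data are illustrative assumptions, not the models from the talk.

```python
# A minimal hard-parameter-sharing model: two tasks share the embedding and
# hidden layer, each has its own output head. All sizes and data are assumed.
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, vocab=5000, dim=100, hidden=64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, dim)               # shared
        self.shared = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.head_a = nn.Linear(hidden, 3)                     # task A: 3 classes
        self.head_b = nn.Linear(hidden, 2)                     # task B: 2 classes

    def forward(self, token_ids, task):
        h = self.shared(self.embed(token_ids))
        return self.head_a(h) if task == "a" else self.head_b(h)

model = MultiTaskModel()
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()
# alternate batches between tasks so both update the shared layers
for task, n_classes in [("a", 3), ("b", 2)]:
    x = torch.randint(0, 5000, (4, 10))       # fake batch: 4 texts, 10 tokens
    y = torch.randint(0, n_classes, (4,))
    loss = loss_fn(model(x, task), y)
    opt.zero_grad(); loss.backward(); opt.step()
```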
In this talk, I will present my recent and ongoing work in the space of learning with limited labelled data in NLP, including our NAACL 2018 papers 'Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces' [1] and 'From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings' [2].
[1] https://t.co/A5jHhFWrdw
[2] https://arxiv.org/abs/1802.09375
==========
Bio from my website http://isabelleaugenstein.github.io/index.html:
I have been a tenure-track assistant professor at the University of Copenhagen, Department of Computer Science, since July 2017. I am affiliated with the CoAStAL NLP group and work in the general areas of Statistical Natural Language Processing and Machine Learning. My main research interests are weakly supervised and low-resource learning, with applications including information extraction, machine reading and fact checking.
Before starting a faculty position, I was a postdoctoral research associate in Sebastian Riedel's UCL Machine Reading group, mainly investigating machine reading from scientific articles. Prior to that, I was a Research Associate in the Sheffield NLP group, a PhD Student in the University of Sheffield Computer Science department, a Research Assistant at AIFB, Karlsruhe Institute of Technology and a Computational Linguistics undergraduate student at the Department of Computational Linguistics, Heidelberg University.
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 (Saurabh Kaushik)
This document discusses how deep learning techniques can be applied to natural language processing tasks. It begins by explaining some of the limitations of traditional rule-based and machine learning approaches to NLP, such as the lack of semantic understanding and difficulty of feature engineering. Deep learning approaches can learn features automatically from large amounts of unlabeled text and better capture semantic and syntactic relationships between words. Recurrent neural networks are well-suited for NLP because they can model sequential data like text, and convolutional neural networks can learn hierarchical patterns in text.
Introduction to natural language processing (Minh Pham)
This document provides an introduction to natural language processing (NLP). It discusses what NLP is, why NLP is a difficult problem, the history of NLP, fundamental NLP tasks like word segmentation, part-of-speech tagging, syntactic analysis and semantic analysis, and applications of NLP like information retrieval, question answering, text summarization and machine translation. The document aims to give readers an overview of the key concepts and challenges in the field of natural language processing.
Natural language processing for requirements engineering: ICSE 2021 Technical Briefing (alessio_ferrari)
These are the slides for the technical briefing at ICSE 2021, given by Alessio Ferrari, Liping Zhao, and Waad Alhoshan.
It covers RE tasks to which NLP is applied, an overview of a recent systematic mapping study on the topic, and a hands-on tutorial on using transfer learning for requirements classification.
Please find the links to the colab notebooks here:
https://colab.research.google.com/drive/158H-lEJE1pc-xHc1ISBAKGDHMt_eg4Gn?usp=sharing
https://colab.research.google.com/drive/1B_5ow3rvS0Qz1y-KyJtlMNnmgmx9w3kJ?usp=sharing
https://colab.research.google.com/drive/1Xrm0gNaa41YwlM5g2CRYYXcRvpbDnTRT?usp=sharing
NAVER developers who attended ACL 2018 share what they learned there.
1. Overview - Lucy Park
2. Tutorials - Xiaodong Gu
3. Main conference
a. Semantic parsing - Soonmin Bae
b. Dialogue - Kyungduk Kim
c. Machine translation - Zae Myung Kim
d. Summarization - Hye-Jin Min
4. Workshops - Minjoon Seo
This document provides an overview of natural language processing (NLP). It begins with examples of NLP applications like translation and question answering. It then discusses the backgrounds in artificial intelligence, linguistics, and the web. The document outlines several common NLP tasks like part-of-speech tagging, named-entity recognition, word sense disambiguation, and parsing. It also discusses challenges like ambiguity in natural language. The document concludes with a discussion of why NLP is difficult due to ambiguity at both the linguistic and acoustic levels.
This document provides an overview of natural language processing (NLP). It defines NLP, discusses common NLP tasks such as part-of-speech tagging and machine translation, and explains why NLP is challenging due to various ambiguities in natural language. The document also briefly discusses related fields like linguistics, machine learning, and information retrieval, and concludes by noting that it only covers an introduction to NLP and does not discuss solutions or the current state of the field.
This document provides an overview of representation learning techniques for natural language processing (NLP). It begins with introductions to the speakers and objectives of the workshop, which is to provide a deep dive into state-of-the-art text representation techniques. The workshop is divided into four modules: archaic techniques, word vectors, sentence/paragraph/document vectors, and character vectors. The document provides background on why text representation is important for NLP, and discusses older techniques like one-hot encoding, bag-of-words, n-grams, and TF-IDF. It also introduces newer distributed representation techniques like word2vec's skip-gram and CBOW models, GloVe, and the use of neural networks for language modeling.
This document provides an overview of representation learning techniques for natural language processing (NLP). It begins with introducing the speakers and objectives of the workshop, which is to provide a deep dive into state-of-the-art text representation techniques and how to apply them to solve NLP problems. The workshop covers four modules: 1) archaic techniques, 2) word vectors, 3) sentence/paragraph/document vectors, and 4) character vectors. It emphasizes that representation learning is key to NLP as it transforms raw text into a numeric form that machine learning models can understand.
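As a small illustration of the "archaic techniques" both summaries mention, the sketch below contrasts bag-of-words counts with TF-IDF weights using scikit-learn; the two-document toy corpus is an illustrative assumption.

```python
# Toy contrast of bag-of-words counts vs. TF-IDF weights with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = ["the cat sat on the mat", "the dog sat on the log"]

bow = CountVectorizer().fit(corpus)
print(bow.get_feature_names_out())          # the vocabulary
print(bow.transform(corpus).toarray())      # raw counts per document

tfidf = TfidfVectorizer().fit(corpus)
print(tfidf.transform(corpus).toarray())    # words shared by all documents get downweighted
```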
This document provides an overview of supervised and semi-supervised machine learning techniques for natural language processing (NLP). It begins with an introduction on why machine learning is important for NLP. It then discusses three examples of supervised learning tasks: review classification, relevance ranking, and machine translation. The remainder of the document covers supervised learning methods including generative models, discriminative models such as AdaBoost and support vector machines (SVMs), and applications to NLP problems. It concludes by mentioning opportunities to work with Microsoft on related research.
Transfer learning in NLP involves pre-training large language models on unlabeled text and then fine-tuning them on downstream tasks. Current state-of-the-art models such as BERT, GPT-2, and XLNet use bidirectional transformers pretrained using techniques like masked language modeling. These models have billions of parameters and require huge amounts of compute but have achieved SOTA results on many NLP tasks. Researchers are exploring ways to reduce model sizes through techniques like distillation while maintaining high performance. Open questions remain around model interpretability and generalization.
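A small sketch of the size reduction that distillation buys, comparing a pretrained model with its distilled counterpart via the transformers library; the parameter counts are computed at run time rather than quoted.

```python
# Compare a pretrained model with its distilled counterpart; parameter
# counts are computed at run time, not quoted from the text.
from transformers import AutoModel

for name in ["bert-base-uncased", "distilbert-base-uncased"]:
    model = AutoModel.from_pretrained(name)
    n = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n / 1e6:.0f}M parameters")
# DistilBERT keeps most of BERT's accuracy with roughly 40% fewer parameters.
```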
Technical Development Workshop - Text Analytics with Python (Michelle Purnama)
In this workshop, we covered:
- How to code using Python on Jupyter Notebook
- What is NLP (Natural Language Processing) and how it is useful to analyze sentiments of a given text
- Hands-on coding exercises on how to use Python and its libraries to perform tokenization, stopwords removal, stemming and lemmatization, and POS tagging.
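A minimal sketch of those preprocessing steps with NLTK; resource names can vary slightly across NLTK versions, and the sample sentence is an illustrative assumption.

```python
# Tokenization, stopword removal, stemming/lemmatization, and POS tagging
# with NLTK; the sample sentence is an illustrative assumption.
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

for pkg in ["punkt", "stopwords", "wordnet", "averaged_perceptron_tagger"]:
    nltk.download(pkg, quiet=True)

text = "The children were running through the haunted houses."
tokens = nltk.word_tokenize(text)                               # tokenization
content = [t for t in tokens if t.lower() not in stopwords.words("english")]
stems = [PorterStemmer().stem(t) for t in content]              # e.g. running -> run
lemmas = [WordNetLemmatizer().lemmatize(t) for t in content]    # e.g. houses -> house
print(content, stems, lemmas, nltk.pos_tag(tokens), sep="\n")   # POS tagging
```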
Gadgets pwn us? A pattern language for CALL (Lawrie Hunter)
The document discusses creating a pattern language for computer-assisted language learning (CALL). It explores the concept of a pattern language as defined by Christopher Alexander and proposes a framework for creating a CALL pattern language in the era of web 2.0. The paper seeks to rework concepts from other fields, like "formal learning design expression" and "task arc," and have participants brainstorm elements to include through graphical challenges. The overall goal is to establish foundational patterns for CALL work.
This document provides an introduction and overview of a programming lecture. It defines programming as a precise sequence of steps to solve problems. It emphasizes that teaching programming develops critical thinking and attention to detail skills. The lecture outlines that to design a program properly, one must analyze problems, express solutions abstractly and with examples, and formulate statements precisely. It notes computers are stupid but humans are even more so. The lecture also covers programming concepts like reuse, user interfaces, comments, and logical errors. It outlines the course objectives, policy, books, and contents which include variables, control structures, functions, and more advanced C topics.
Visual-Semantic Embeddings: some thoughts on Language (Roelof Pieters)
Language technology is rapidly evolving. A resurgence in the use of distributed semantic representations and word embeddings, combined with the rise of deep neural networks has led to new approaches and new state of the art results in many natural language processing tasks. One such exciting - and most recent - trend can be seen in multimodal approaches fusing techniques and models of natural language processing (NLP) with that of computer vision.
The talk aims to give an overview of the NLP side of this trend. It starts with a short overview of the challenges in creating deep networks for language, what makes for a "good" language model, and the specific requirements of semantic word spaces for multi-modal embeddings.
This document summarizes the internship of Ho Xuan Vinh at Kyoto Institute of Technology aimed at creating a bilingual annotated corpus of Vietnamese-English for machine learning purposes. Vinh experimented with several semantic tagsets, including WordNet, LLOCE, and UCREL, but faced challenges due to the lack of Vietnamese language resources. His goal was to find an effective method for annotating a bilingual corpus to provide training data for natural language processing tasks, but he was unable to validate his annotation approaches due to limitations in the available data and tools.
Code-mixed machine translation remains an open problem. Challenges include the lack of parallel corpora for training statistical or neural machine translation models. Researchers are actively working on this by combining monolingual models or using synthetic data.
Sentiment analysis aims to identify the orientation of opinions in text, whether positive, negative, or neutral. It draws from fields like cognitive science, natural language processing, and machine learning. Challenges include domain dependence, sarcasm, and contrasting with standard text categorization where word presence indicates category. Approaches include subjectivity detection using graph algorithms, sentiment lexicons capturing word sentiment, and scoring adjective-adverb combinations. Applications include review analysis, question answering, and developing "hate mail" filters. Future work includes exploring the cognitive perspective on sentiment analysis.
The document describes the Naive Bayes classifier. It begins with background on probabilistic classification and generative models. It then covers probability basics and the probabilistic classification rule. It introduces Naive Bayes, making the independence assumption between attributes. The Naive Bayes algorithm is described for both the learning and testing phases. An example of classifying whether to play tennis is provided. Issues with Naive Bayes like violating independence and zero probabilities are discussed. Continuous attributes modeled with normal distributions are also covered. It concludes that Naive Bayes training is easy and fast while testing is straightforward, and it often performs competitively despite its independence assumption.
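To make the learning and testing phases concrete, here is a from-scratch sketch with Laplace smoothing, which addresses the zero-probability issue mentioned above; the tiny play-tennis-style dataset is an illustrative assumption.

```python
# From-scratch Naive Bayes with Laplace smoothing; the toy dataset is assumed.
from collections import Counter, defaultdict
import math

data = [({"outlook": "sunny", "wind": "strong"}, "no"),
        ({"outlook": "overcast", "wind": "weak"}, "yes"),
        ({"outlook": "rain", "wind": "weak"}, "yes"),
        ({"outlook": "sunny", "wind": "weak"}, "no")]

# learning phase: count class and (attribute, class) -> value frequencies
class_counts = Counter(label for _, label in data)
feat_counts = defaultdict(Counter)
values = defaultdict(set)
for x, label in data:
    for attr, v in x.items():
        feat_counts[(attr, label)][v] += 1
        values[attr].add(v)

def predict(x):
    # testing phase: argmax_c of log P(c) + sum_i log P(x_i | c)
    scores = {}
    for c, n_c in class_counts.items():
        s = math.log(n_c / len(data))
        for attr, v in x.items():
            # Laplace smoothing over the values seen for this attribute
            s += math.log((feat_counts[(attr, c)][v] + 1) / (n_c + len(values[attr])))
        scores[c] = s
    return max(scores, key=scores.get)

print(predict({"outlook": "sunny", "wind": "weak"}))  # -> "no"
```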
Here are the key steps in building a spoken dialogue system:
1. Speech recognition: Convert speech to text using acoustic and language models.
2. Natural language understanding: Parse text and extract meaning using NLP techniques like parsing.
3. Dialogue management: Maintain context of conversation and decide system responses using dialogue state tracking and policy models.
4. Natural language generation: Convert system responses to text using templates and/or data-to-text models.
5. Speech synthesis: Convert text to speech using text-to-speech models.
The goal is a smooth turn-taking conversation where the system understands the user's intent and responds appropriately through speech. Challenges include speech recognition errors, among others.
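A toy skeleton of the five-stage pipeline, with every component stubbed out; real systems would replace each function with ASR/TTS engines and trained models, so all names and behaviour here are illustrative assumptions.

```python
# Toy skeleton of the five-stage spoken dialogue pipeline; every function
# is a stand-in for a real component.
def speech_recognition(audio):                       # 1. ASR
    return "what is the weather in paris"

def nlu(text):                                       # 2. intent + slot filling
    return {"intent": "weather", "city": text.rsplit(" ", 1)[-1]}

def dialogue_manager(state, frame):                  # 3. state tracking + policy
    state.update(frame)
    return {"act": "inform", "city": state["city"]}

def nlg(action):                                     # 4. template-based generation
    return f"Here is the weather for {action['city'].title()}."

def tts(text):                                       # 5. stand-in for synthesis
    print(f"[speaking] {text}")

state = {}
tts(nlg(dialogue_manager(state, nlu(speech_recognition(b"...")))))
```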
This document discusses decision support, data warehousing, and online analytical processing (OLAP). It defines these terms and compares online transaction processing (OLTP) to OLAP. It describes the evolution of decision support from batch reports to integrated data warehouses. The benefits of separating data warehouses from operational databases are outlined. Common architectures and the design/operational process are summarized.
The document discusses the ETL (Extract-Transform-Load) process used in data warehousing. The key steps of ETL are: 1) Capture/Extract which obtains data from source systems, 2) Scrub/Cleanse which cleans and formats the data, 3) Transform which converts the data to the structure of the data warehouse, and 4) Load/Index which loads the transformed data into the warehouse where it can be indexed. The ETL process results in data in the warehouse that is detailed, historical, normalized, comprehensive, timely, and quality controlled.
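A toy sketch of the four ETL steps using pandas; the schema, data, and cleaning rules are illustrative assumptions.

```python
# Toy sketch of the four ETL steps with pandas; schema and data are assumed.
import io
import pandas as pd

raw = io.StringIO("id,amount,date\n1,10.5,2024-01-03\n2,,2024-01-04\n")
df = pd.read_csv(raw)                          # 1. capture/extract from a source
df = df.dropna(subset=["amount"])              # 2. scrub/cleanse bad records
df["date"] = pd.to_datetime(df["date"])        # 3. transform to the warehouse schema
df["year"] = df["date"].dt.year
df.to_csv("warehouse_sales.csv", index=False)  # 4. load (indexing happens in the warehouse)
```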
The document summarizes a lecture on decision trees for classification. It introduces decision trees as a non-linear classification approach that learns directly from data representations. It then describes the basic ID3 algorithm for learning decision trees in a greedy top-down manner by choosing attributes that best split the data based on information gain at each step, recursively building the tree until reaching leaf nodes of single target classifications. The goal is to learn a compact tree representation of the training data.
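The attribute-selection step at the heart of ID3 can be shown in a few lines: compute entropy, then pick the split with the highest information gain. The toy labels and candidate splits below are illustrative assumptions.

```python
# Entropy and information gain, the attribute-selection step of ID3.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, split):
    """split: the attribute value of each example, aligned with labels."""
    n = len(labels)
    rem = sum((cnt / n) * entropy([l for l, s in zip(labels, split) if s == v])
              for v, cnt in Counter(split).items())
    return entropy(labels) - rem

labels = ["yes", "yes", "no", "no"]
print(information_gain(labels, ["sunny", "rain", "sunny", "rain"]))    # 0.0
print(information_gain(labels, ["weak", "weak", "strong", "strong"]))  # 1.0
# ID3 greedily picks the attribute with the highest gain at each node.
```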
Build applications with generative AI on Google Cloud (Márton Kodok)
We will explore Vertex AI Model Garden powered experiences and learn more about the integration of these generative AI APIs. We will see in action what the Gemini family of generative models offers developers for building and deploying AI-driven applications. Vertex AI includes a suite of foundation models, referred to as the PaLM and Gemini families of generative AI models, which come in different versions. We will cover how to use them via API to:
- execute prompts in text and chat
- cover multimodal use cases with image prompts
- fine-tune and distill to improve knowledge domains
- run function calls with foundation models to optimize them for specific tasks
At the end of the session, developers will understand how to innovate with generative AI and develop apps following current industry trends.
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data (Kiwi Creative)
Harness the power of AI-backed reports, benchmarking and data analysis to predict trends and detect anomalies in your marketing efforts.
Peter Caputa, CEO at Databox, reveals how you can discover the strategies and tools to increase your growth rate (and margins!).
From metrics to track to data habits to pick up, enhance your reporting for powerful insights to improve your B2B tech company's marketing.
- - -
This is the webinar recording from the June 2024 HubSpot User Group (HUG) for B2B Technology USA.
Watch the video recording at https://youtu.be/5vjwGfPN9lw
Sign up for future HUG events at https://events.hubspot.com/b2b-technology-usa/
Open Source Contributions to Postgres: The Basics - POSETTE 2024 (ElizabethGarrettChri)
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
The Ipsos AI Monitor 2024 Report (Social Samosa)
According to Ipsos AI Monitor's 2024 report, 65% Indians said that products and services using AI have profoundly changed their daily life in the past 3-5 years.
End-to-end pipeline agility - Berlin Buzzwords 2024 (Lars Albertsson)
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
2. Announcements
Waiting list: start attending the first few lectures as if you were registered. Given that some students will drop the class, space will free up.
We will use Piazza as an online discussion platform. Please sign up here: piazza.com/ucla/fall2017/cs269
4. This lecture
Course Overview
What is NLP? Why is it important?
What types of ML methods are used in NLP?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
Key ML ideas in NLP
5. What is NLP
Wiki: Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.
6. Go beyond keyword matching
Identify the structure and meaning of words, sentences, texts, and conversations
Deep understanding of broad language
NLP is all around us
15. Digital personal assistant
Semantic parsing – understand tasks
Entity linking – "my wife" = "Kellie" in the phone book
Credit: techspot.com
More on natural language instruction
17. Language Comprehension
Q: Who wrote Winnie the Pooh?
Q: Where did Chris live?
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
18. What will you learn from this course
The NLP pipeline
Key components for understanding text
NLP systems/applications
Current techniques & limitations
Build realistic NLP tools
19. What’s not covered by this course
Speech recognition – no signal processing
Natural language generation
Details of ML algorithms / theory
Text mining / information retrieval
20. This lecture
Course Overview
What is NLP? Why is it important?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
21. Overview
New course, first time being offered
Comments are welcome
Targeted at first- or second-year PhD students
Lecture + Seminar
No course prerequisites, but I assume
programming experience (for the final project)
basic ML/AI background
basics of probability, calculus, and linear algebra (HW0)
22. Grading
Attendance & participation (10%)
Participate in discussion
Paper summarization report (20%)
Paper presentation (30%)
Final project (40%)
Proposal (5%)
Final Paper (25%)
Presentation (10%)
23. Paper summarization
1 page maximum
Pick one paper from a recent ACL/NAACL/EMNLP/EACL conference
Summarize the paper (use your own words)
Write a blog post using markdown or a Jupyter notebook:
https://einstein.ai/research/learned-in-translation-contextualized-word-vectors
https://github.com/uclanlp/reducingbias/blob/master/src/fairCRF_gender_ratio.ipynb
26. Paper presentation
Each group has 2-3 students
Read and understand 2-3 related papers
Cannot be the same as your paper summary
Can be related to your final project
Register your choice next week
30 min presentation / Q&A
Grading rubric: 40% technical understanding, 40% presentation, 20% interaction
27. Final Project
Work in groups (3 students)
Project proposal
1 page maximum (template)
Project report
Similar to the paper summary
Due before the final presentation
Project presentation
in-class presentation (tentative)
28. Late Policy
The submission site will be closed 1 hour after the deadline.
No late submissions, unless under an emergency situation
29. Cheating/Plagiarism
No. Ask if you have concerns
Rules of thumb:
Cite your references
Clearly state what your contributions are
30. Lectures and office hours
Participation is highly appreciated!
Ask questions if you are still confused
Feedback is welcome
Lead the discussion in this class
Enroll in Piazza
31. Topics of this class
Fundamental NLP problems
Machine learning & statistical approaches for NLP
NLP applications
Recent trends in NLP
32. What to Read?
Natural Language Processing
ACL, NAACL, EACL, EMNLP, CoNLL, Coling, TACL
aclweb.org/anthology
Machine learning
ICML, NIPS, ECML, AISTATS, ICLR, JMLR, MLJ
Artificial Intelligence
AAAI, IJCAI, UAI, JAIR
34. This lecture
Course Overview
What is NLP? Why is it important?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
Key ML ideas in NLP
36. Challenges – ambiguity
Word sense / meaning ambiguity
Credit: http://stuffsirisaid.com
37. Challenges – ambiguity
PP attachment ambiguity
Credit: Mark Liberman, http://languagelog.ldc.upenn.edu/nll/?p=17711
38. Challenges – ambiguity
Ambiguous headlines:
Include your children when baking cookies
Local High School Dropouts Cut in Half
Hospitals are Sued by 7 Foot Doctors
Iraqi Head Seeks Arms
Safety Experts Say School Bus Passengers Should Be Belted
Teacher Strikes Idle Kids
39. Challenges – ambiguity
Pronoun reference ambiguity
Credit: http://www.printwand.com/blog/8-catastrophic-examples-of-word-choice-mistakes
40. Challenges – language is not static
Language grows and changes
e.g., cyber lingo:
LOL Laugh out loud
G2G Got to go
BFN Bye for now
B4N Bye for now
Idk I don't know
FWIW For what it's worth
LUWAMH Love you with all my heart
43. Challenges – scale
Examples:
Bible (King James version): ~700K words
Penn Treebank: ~1M words from the Wall Street Journal
Newswire collection: 500M+ words
Wikipedia: 2.9 billion words (English)
Web: several billions of words
44. This lecture
Course Overview
What is NLP? Why is it important?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
Key ML ideas in NLP
49. Semantic analysis
Word sense disambiguation
Semantic role labeling
Credit: Ivan Titov
50. Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
Q: [Chris] = [Mr. Robin]?
Slide modified from Dan Roth
51. Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
Co-reference Resolution
52. This lecture
Course Overview
What is NLP? Why is it important?
What will you learn from this course?
Course Information
What are the challenges?
Key NLP components
Key ML ideas in NLP
56. Classification is generally well-understood
Theoretically: generalization bounds tell us how many examples are needed to train a good model
Algorithmically: efficient algorithms exist for large data sets
E.g., it takes a few seconds to train a linear SVM on data with millions of instances and features
Algorithms for non-linear models
E.g., kernel methods
Is this enough to solve all real-world problems?
58. Challenges
Modeling challenges: how to model a complex decision?
Representation challenges: how to extract features?
Algorithmic challenges: large amounts of data and complex decision structures
[Bill Clinton], recently elected as the President of the USA, has been invited by the Russian President, [Vladimir Putin], to visit Russia. President Clinton said that he looks forward to strengthening ties between USA and Russia
(The slide also shows example passages illustrating such complex decisions: excerpts from machine-learning paper abstracts, including one on the learning-to-search algorithm LOLS, and the Winnie the Pooh coreference passage.)
Structured prediction models
Deep learning models
Inference / learning algorithms
59. Modeling Challenges
How to model a complex decision?
Why is this important?
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
65. Bridge the gap
Simple classifiers are not designed to handle complex outputs
Need to make multiple decisions jointly
Example: POS tagging
Example from Vivek Srikumar
can you can a can as a canner can can a can
66. Make multiple decisions jointly
Example: POS tagging
Each part needs a label: assign a tag (V., N., A., ...) to each word in the sentence
The decisions are mutually dependent: e.g., cannot have a verb followed by a verb
Results are evaluated jointly
can you can a can as a canner can can a can
67. Structured prediction problems
Problems that have multiple interdependent output variables, where the output assignments are evaluated jointly
Need a joint assignment to all the output variables
We call this joint inference, global inference, or simply inference
68. A general learning setting
Input: x ∈ X; Truth: y* ∈ Y(x); Predicted: h(x) ∈ Y(x); Loss: loss(y, y*)
I can can a can
Pro Md Vb Dt Nn
Pro Md Nn Dt Vb
Pro Md Nn Dt Md
Pro Md Md Dt Nn
Pro Md Md Dt Vb
Goal: make a joint prediction to minimize a joint loss, i.e., find h ∈ H with h(x) ∈ Y(x) minimizing E_{(x,y)∼D}[loss(y, h(x))], based on N samples (x_n, y_n) ∼ D
69. Combinatorial output space
Input: x ∈ X; Truth: y* ∈ Y(x); Predicted: h(x) ∈ Y(x); Loss: loss(y, y*)
I can can a can
Pro Md Vb Dt Nn
Pro Md Nn Dt Vb
Pro Md Nn Dt Md
Pro Md Md Dt Nn
Pro Md Md Dt Vb
# POS tags: 45. How many possible outputs for a sentence with 10 words? 45^10 ≈ 3.4 × 10^16
Observation: not all sequences are valid, and we don't need to consider all of them
70. Representation of interdependent output variables
A compact way to represent output combinations
Abstract away unnecessary complexities
We know how to process them: graph algorithms for linear chains, trees, etc.
Pronoun Verb Noun And Noun
Root They operate ships and banks .
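As an example of such a graph algorithm for the linear-chain case, here is a minimal Viterbi decoder; the emission and transition scores in the usage example are made up for illustration.

```python
# A minimal Viterbi decoder for a linear-chain model; all scores here are
# illustrative assumptions, not learned values.
import numpy as np

def viterbi(emit, trans):
    """emit: (n_words, n_tags) local scores; trans: (n_tags, n_tags) scores
    for tag[t-1] -> tag[t]. Returns the highest-scoring tag sequence."""
    n, k = emit.shape
    score = np.zeros((n, k))
    back = np.zeros((n, k), dtype=int)
    score[0] = emit[0]
    for t in range(1, n):
        # for each current tag, find the best previous tag (O(k^2) per position)
        cand = score[t - 1][:, None] + trans + emit[t][None, :]
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0)
    tags = [int(score[-1].argmax())]
    for t in range(n - 1, 0, -1):   # follow back-pointers
        tags.append(int(back[t][tags[-1]]))
    return tags[::-1]

# Toy usage: 3 words, 2 tags (0 = noun, 1 = verb); numbers are made up.
emit = np.array([[2.0, 0.5], [0.2, 1.5], [1.0, 1.2]])
trans = np.array([[0.1, 0.8], [1.0, -0.5]])   # e.g., discourage verb -> verb
print(viterbi(emit, trans))                    # -> [0, 1, 0]
```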
71. Algorithms/models for structured prediction
Many learning algorithms can be generalized to the structured case:
Perceptron → structured perceptron
SVM → structured SVM
Logistic regression → conditional random field (a.k.a. log-linear models)
Can also be solved by a reduction stack: structured prediction → multi-class → binary
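A compact sketch of one of these generalizations, the structured perceptron, reusing the viterbi function from the sketch above; the simple indicator-feature design is an illustrative assumption.

```python
# A compact structured perceptron for POS tagging, reusing viterbi() from
# the sketch above; the indicator-feature design is a simplifying assumption.
import numpy as np

def train_structured_perceptron(sents, tags, n_words, n_tags, epochs=5):
    """sents, tags: lists of equal-length sequences of word ids / gold tag ids."""
    W_emit = np.zeros((n_words, n_tags))    # (word, tag) indicator weights
    W_trans = np.zeros((n_tags, n_tags))    # (prev tag, tag) indicator weights
    for _ in range(epochs):
        for x, y in zip(sents, tags):
            pred = viterbi(W_emit[np.array(x)], W_trans)  # joint inference
            if pred != list(y):
                # promote features of the gold sequence, demote the prediction
                for t, w in enumerate(x):
                    W_emit[w, y[t]] += 1.0
                    W_emit[w, pred[t]] -= 1.0
                    if t > 0:
                        W_trans[y[t - 1], y[t]] += 1.0
                        W_trans[pred[t - 1], pred[t]] -= 1.0
    return W_emit, W_trans
```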
72. Representation Challenges
How to obtain features?
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
73. Representation Challenges
How to obtain features?
1. Design features based on domain knowledge
E.g., by patterns in parse trees, or by nicknames
Needs human experts/knowledge
When Chris was three years old, his father wrote a poem about him.
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm.
74. Representation Challenges
How to obtain features?
1. Design features based on domain knowledge
2. Design feature templates and then let the machine find the right ones
E.g., use all words, pairs of words, ... (see the sketch below)
Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
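A minimal sketch of what "use all words, pairs of words" looks like as feature templates; the particular template set is an illustrative assumption.

```python
# Feature templates ("all words, pairs of words") expanded into binary
# features around one position; the template set is an assumption.
def feature_template(sentence, i):
    """Instantiate simple templates around position i."""
    feats = {f"word={sentence[i]}"}                         # unigram template
    if i > 0:
        feats.add(f"prev_word={sentence[i-1]}")
        feats.add(f"bigram={sentence[i-1]}_{sentence[i]}")  # pair-of-words template
    if i + 1 < len(sentence):
        feats.add(f"next_word={sentence[i+1]}")
    return feats

sent = "Mr. Robin then wrote a book".split()
print(feature_template(sent, 1))
# {'word=Robin', 'prev_word=Mr.', 'bigram=Mr._Robin', 'next_word=then'}
```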
75. Representation Challenges
How to obtain features?
1. Design features based on domain knowledge
2. Design feature templates and then let the machine find the right ones
Challenges:
# features can be very large
# English words: 171K (Oxford)
# bigrams: 171K^2 ≈ 3 × 10^10; # trigrams?
For some domains, it is hard to design features
76. Representation learning
Learn compact representations of features
Combinatorial → continuous representations
77. Representation learning
Learn compact representations of features
Combinatorial → continuous representations
Hierarchical/compositional
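A minimal sketch of learning such continuous representations with gensim's word2vec; the tiny corpus and the hyperparameters are illustrative assumptions.

```python
# Learning continuous word representations with gensim's word2vec;
# the tiny corpus and hyperparameters are illustrative assumptions.
from gensim.models import Word2Vec

corpus = [["chris", "lived", "in", "a", "pretty", "home"],
          ["his", "father", "wrote", "a", "poem"],
          ["mr", "robin", "wrote", "a", "book"]]
model = Word2Vec(corpus, vector_size=25, window=2, min_count=1, epochs=50)
print(model.wv["wrote"][:5])          # a dense 25-dim vector, not a one-hot
print(model.wv.most_similar("poem"))  # nearest neighbours in the learned space
```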
78. What will you learn from this course
Structured prediction: models / inference / learning
Representation (deep) learning: input/output representations
Combining structured models and deep learning