This deck covers the problem of fine-tuning a pre-trained BERT model for the task of Question Answering. Check out the GluonNLP model zoo for models and tutorials: http://gluon-nlp.mxnet.io/model_zoo/bert/index.html
Slides: Thomas Delteil
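To make the task concrete before diving into the slides, here is a minimal sketch of the fine-tuning setup: loading the pre-trained encoder from the model zoo and attaching a span-prediction head. This is a hedged sketch assuming GluonNLP 0.x with MXNet 1.x; the model zoo's SQuAD fine-tuning script linked above is the full, authoritative version.

```python
# Minimal sketch of BERT-for-QA, assuming GluonNLP 0.x and MXNet 1.x.
import mxnet as mx
from mxnet import gluon
import gluonnlp as nlp

ctx = mx.cpu()

# Load a pre-trained 12-layer BERT-base encoder and its vocabulary;
# drop the pre-training heads, since QA only needs the encoder.
bert, vocab = nlp.model.get_model(
    'bert_12_768_12',
    dataset_name='book_corpus_wiki_en_uncased',
    pretrained=True,
    use_pooler=False,
    use_decoder=False,
    use_classifier=False,
    ctx=ctx)

class BertForQA(gluon.nn.HybridBlock):
    """BERT encoder plus a dense layer producing start/end span logits."""
    def __init__(self, bert, **kwargs):
        super().__init__(**kwargs)
        self.bert = bert
        with self.name_scope():
            # Two outputs per token position: answer-start and answer-end scores.
            self.span_classifier = gluon.nn.Dense(2, flatten=False)

    def hybrid_forward(self, F, inputs, token_types, valid_length):
        # Contextual embeddings for every position: (batch, seq_len, 768)
        encoded = self.bert(inputs, token_types, valid_length)
        logits = self.span_classifier(encoded)           # (batch, seq_len, 2)
        start_logits, end_logits = F.split(logits, axis=2, num_outputs=2)
        return (F.squeeze(start_logits, axis=-1),
                F.squeeze(end_logits, axis=-1))

net = BertForQA(bert)
# Only the new head needs initialization; the encoder is already pre-trained.
net.span_classifier.initialize(init=mx.init.Normal(0.02), ctx=ctx)
```

Fine-tuning then minimizes softmax cross-entropy between these per-position logits and the labeled start/end indices of the answer span.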
Question Answering System using machine learning approach – Garima Nanda
In compact form, this presentation shows how a machine learning approach based on classification techniques can be used for effective and efficient question answering.
The document discusses question answering over knowledge graphs. It introduces question answering and describes how knowledge graphs can be used to answer natural language questions. It summarizes three proposed papers on learning knowledge graphs for question answering through dialogs, automated template generation for question answering over knowledge graphs, and generating knowledge questions from knowledge graphs. The document also covers motivation for question answering, defining characteristics, different methods like template-based and dialog-based systems, evaluating knowledge quality, and examples of question answering systems.
Natural Language Processing (NLP) & Text Mining Tutorial Using NLTK | NLP Tra... – Edureka!
** NLP Using Python: - https://www.edureka.co/python-natural-language-processing-course **
This Edureka PPT will provide you with comprehensive and detailed knowledge of Natural Language Processing, popularly known as NLP. You will also learn about the different steps involved in processing human language, like tokenization, stemming, and lemmatization, along with a demo of each topic.
The following topics are covered in this PPT:
1. The Evolution of Human Language
2. What is Text Mining?
3. What is Natural Language Processing?
4. Applications of NLP
5. NLP Components and Demo
Follow us to never miss an update in the future.
Instagram: https://www.instagram.com/edureka_learning/
Facebook: https://www.facebook.com/edurekaIN/
Twitter: https://twitter.com/edurekain
LinkedIn: https://www.linkedin.com/company/edureka
Lecture 4: Transformers (Full Stack Deep Learning - Spring 2021) – Sergey Karayev
This document discusses a lecture on transfer learning and transformers. It begins with an outline of topics to be covered, including transfer learning in computer vision, embeddings and language models, ELMo/ULMFiT as "NLP's ImageNet Moment", transformers, attention in detail, and BERT, GPT-2, DistilBERT and T5. It then goes on to provide slides and explanations on these topics, discussing how transfer learning works, word embeddings and language models such as Word2Vec, ELMo, and ULMFiT, the transformer architecture, attention mechanisms, and prominent transformer models.
This material was created for a lab seminar on the Transformer, the architecture underlying recent NLP × deep learning research. Citations and references were prepared as accurately as possible; please point out any errors.
The document provides an overview of question answering systems, including their evolution from information retrieval, common evaluation benchmarks like TREC and CLEF, and examples of major QA projects like Watson. It also discusses the movement towards leveraging semantic technologies and linked open data to power next generation QA systems, as seen in projects like SINA which transform natural language queries into formal queries over structured knowledge bases.
The document discusses the BERT model for natural language processing. It begins with an introduction to BERT and how it achieved state-of-the-art results on 11 NLP tasks in 2018. The document then covers related work on language representation models including ELMo and GPT. It describes the key aspects of the BERT model, including its bidirectional Transformer architecture, pre-training using masked language modeling and next sentence prediction, and fine-tuning for downstream tasks. Experimental results are presented showing BERT outperforming previous models on the GLUE benchmark, SQuAD 1.1, SQuAD 2.0, and SWAG. Ablation studies examine the importance of the pre-training tasks and the effect of model size.
A Comprehensive Review of Large Language Models for.pptx – SaiPragnaKancheti
The document presents a review of large language models (LLMs) for code generation. It discusses different types of LLMs including left-to-right, masked, and encoder-decoder models. Existing models for code generation like Codex, GPT-Neo, GPT-J, and CodeParrot are compared. A new model called PolyCoder with 2.7 billion parameters trained on 12 programming languages is introduced. Evaluation results show PolyCoder performs less well than comparably sized models but outperforms others on C language tasks. In general, performance improves with larger models and longer training, but training solely on code can be sufficient or advantageous for some languages.
GPT-2: Language Models are Unsupervised Multitask Learners – Young Seok Kim
This document summarizes a technical paper about GPT-2, an unsupervised language model created by OpenAI. GPT-2 is a transformer-based model trained on a large corpus of internet text using byte-pair encoding. The paper describes experiments showing GPT-2 can perform various NLP tasks like summarization, translation, and question answering with limited or no supervision, though performance is still below supervised models. It concludes that unsupervised task learning is a promising area for further research.
Using Text Embeddings for Information Retrieval – Bhaskar Mitra
Neural text embeddings provide dense vector representations of words and documents that encode various notions of semantic relatedness. Word2vec models typical similarity by representing words based on neighboring context words, while models like latent semantic analysis encode topical similarity through co-occurrence in documents. Dual embedding spaces can separately model both typical and topical similarities. Recent work has applied text embeddings to tasks like query auto-completion, session modeling, and document ranking, demonstrating their ability to capture semantic relationships between text beyond just words.
Neural Language Generation Head to Toe – Hady Elsahar
This is a gentle, intuitive introduction to natural language generation (NLG) using deep learning, aimed at computer science practitioners with basic knowledge of machine learning. It takes you on a journey from the basic intuitions behind modeling language, and how to model probabilities of sequences, to recurrent neural networks, and on to the large Transformer models you have seen in the news, like GPT-2/GPT-3. The tutorial wraps up with a summary of the ethical implications of training such large language models on uncurated text from the internet.
Presented by Ted Xiao at RobotXSpace on 4/18/2017. This workshop covers the fundamentals of Natural Language Processing, crucial NLP approaches, and an overview of NLP in industry.
Word embedding, Vector space model, language modelling, Neural language model, Word2Vec, GloVe, fastText, ELMo, BERT, DistilBERT, RoBERTa, SBERT, Transformer, Attention
Natural language processing and transformer modelsDing Li
The document discusses several approaches to text classification using machine learning algorithms:
1. Count the frequency of individual words in tweets and sum the counts for each tweet to create feature vectors for classification models like logistic regression. However, this loses word-context information.
2. Use Bayes' rule and calculate word probabilities conditioned on class to perform naive Bayes classification, with Laplacian smoothing to handle zero probabilities (see the sketch after this list).
3. Incorporate word n-grams and context by calculating word probabilities within n-gram contexts rather than independently. This captures more linguistic information than the first two approaches.
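To ground the second approach, here is a minimal from-scratch sketch of naive Bayes with Laplacian (add-one) smoothing; the toy tweets, labels, and test sentence are invented for illustration and are not from the summarized deck.

```python
# Naive Bayes text classification with Laplacian (add-one) smoothing.
import math
from collections import Counter

# Invented toy training data: (tweet text, class label).
train = [("great movie loved it", "pos"),
         ("terrible plot hated it", "neg"),
         ("loved the acting", "pos"),
         ("hated every minute", "neg")]

class_counts = Counter(label for _, label in train)
word_counts = {c: Counter() for c in class_counts}
for text, label in train:
    word_counts[label].update(text.split())

vocab = {w for c in word_counts for w in word_counts[c]}

def log_posterior(text, c):
    # log P(c) + sum of log P(w | c), with add-one smoothing so that
    # unseen words do not zero out the whole product.
    total = sum(word_counts[c].values())
    lp = math.log(class_counts[c] / len(train))
    for w in text.split():
        lp += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
    return lp

def classify(text):
    return max(class_counts, key=lambda c: log_posterior(text, c))

print(classify("loved the movie"))  # expected: 'pos'
```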
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding – Young Seok Kim
Review of paper
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
ArXiv link: https://arxiv.org/abs/1810.04805
YouTube Presentation: https://youtu.be/GK4IO3qOnLc
(Slides are written in English, but the presentation is done in Korean)
Building, Evaluating, and Optimizing your RAG App for Production – Sri Ambati
The document discusses optimizing question answering systems built as RAG (Retrieval-Augmented Generation) stacks. It outlines challenges with naive RAG approaches and proposes solutions like improved data representations, advanced retrieval techniques, and fine-tuning large language models. Table-stakes optimizations include tuning chunk sizes, prompt engineering, and customizing LLMs. More advanced techniques involve small-to-big retrieval, multi-document agents, embedding fine-tuning, and LLM fine-tuning.
1) Transformers use self-attention to address problems with RNNs such as vanishing gradients and poor parallelization (a minimal self-attention sketch follows these notes). They combine ideas from CNNs and attention.
2) Transformers have encoder and decoder blocks. The encoder models the input and the decoder models the output. Variants remove the encoder (GPT) or the decoder (BERT) for language modeling.
3) GPT-3 is a large Transformer with 175B parameters that can perform many NLP tasks but still has safety and bias issues.
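To ground the first point, here is a minimal from-scratch sketch of scaled dot-product self-attention (single head, no masking); the sizes and random matrices are invented for illustration.

```python
# Scaled dot-product self-attention, sketched with numpy only.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    # Similarity of every position to every other position, scaled so the
    # softmax does not saturate as d_k grows.
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # (seq_len, d_k)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                          # 5 tokens, d_model=16
Wq, Wk, Wv = (rng.normal(size=(16, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)            # (5, 8)
```

Because every position attends to every other position in one matrix product, the whole sequence is processed in parallel, which is the key advantage over step-by-step RNN recurrence noted above.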
This document provides an overview of building, evaluating, and optimizing a RAG (Retrieval-Augmented Generation) conversational agent for production. It discusses setting up the development environment, prototyping the initial system, and addressing challenges when moving to production, like latency, costs, and quality issues. It also covers approaches for systematically evaluating the system, including using LLMs as judges, and for experimenting with and optimizing components like retrieval and generation through configuration tuning, model fine-tuning, and customizing the pipeline.
Question Answering - Application and Challenges – Jens Lehmann
This document provides an overview of question answering applications and challenges. It defines question answering as receiving natural language questions and providing concise answers. Recent developments in question answering systems are discussed, including IBM Watson. Challenges for question answering over semantic data are explored, such as lexical gaps, ambiguity, granularity, and alternative resources. Large-scale linguistic resources and machine learning approaches for question answering are also covered. Applications of question answering technologies are examined.
This document discusses neural network models for natural language processing tasks like machine translation. It describes how recurrent neural networks (RNNs) were used initially but had limitations in capturing long-term dependencies and parallelization. The encoder-decoder framework addressed some issues but still lost context. Attention mechanisms allowed focusing on relevant parts of the input and using all encoded states. Transformers replaced RNNs entirely with self-attention and encoder-decoder attention, allowing parallelization while generating a richer representation capturing word relationships. This revolutionized NLP tasks like machine translation.
An introduction to the Transformers architecture and BERT – Suman Debnath
The transformer is one of the most popular state-of-the-art (SOTA) deep learning architectures, mostly used for natural language processing (NLP) tasks. Ever since the advent of the transformer, it has replaced RNNs and LSTMs for various tasks. The transformer also created a major breakthrough in the field of NLP and paved the way for new revolutionary architectures such as BERT.
Regulating Generative AI - LLMOps pipelines with Transparency – Debmalya Biswas
The growing adoption of Gen AI, esp. LLMs, has re-ignited the discussion around AI Regulations — to ensure that AI/ML systems are responsibly trained and deployed. Unfortunately, this effort is complicated by multiple governmental organizations and regulatory bodies releasing their own guidelines and policies with little to no agreement on the definition of terms.
Rather than trying to understand and regulate all types of AI, we recommend a different (and practical) approach in this talk based on AI Transparency —
to transparently outline the capabilities of the AI system based on its training methodology and set realistic expectations with respect to what it can (and cannot) do.
We outline LLMOps architecture patterns and show how the proposed approach can be integrated at different stages of the LLMOps pipeline capturing the model's capabilities. In addition, the AI system provider also specifies scenarios where (they believe that) the system can make mistakes, and recommends a ‘safe’ approach with guardrails for those scenarios.
This document provides an overview of natural language processing (NLP). It discusses how NLP allows computers to understand human language through techniques like speech recognition, text analysis, and language generation. The document outlines the main components of NLP including natural language understanding and natural language generation. It also describes common NLP tasks like part-of-speech tagging, named entity recognition, and dependency parsing. Finally, the document explains how to build an NLP pipeline by applying these techniques in a sequential manner.
#AI + #ML + #Robotics combination is a game-changer, so #ServerlessTO members were lucky to have Alex Barbosa Coqueiro - Public Sector Solutions Architect Manager at AWS Canada, introduce us to AWS Robomaker & AWS DeepRacer!
Alex also talked about managed #ReinforcementLearning (RL) with Amazon SageMaker, and compared DeepRacer to Donkey Car – open source project for small-scale self-driving cars.
Video at: https://youtu.be/t8bo9gOveoo
The document discusses building a minimum viable product (MVP) on AWS. It covers what an MVP is, development iterations using sprints and standups, continuously shipping releases, prioritizing tasks, avoiding anti-patterns like over-engineering, and using AWS services like Elastic Beanstalk, Lambda, API Gateway for deploying monoliths or microservices. It also discusses data models and common AWS services for relational, document, graph and other databases.
Building Event-driven Architectures with Amazon EventBridge – James Beswick
Presented at Mountain View Cloud Native Computing Meetup Group on 5/14/2020.
As you build new services across distributed applications, you need to think more about how these services communicate. In moving to event-driven mode, there are numerous factors to consider, including:
• How to scale without upstream services becoming a blocker
• How to manage event routing to downstream destinations
• How to detect new events
• Choosing between a notification pattern and a state transfer pattern
In this session, James will discuss how to think about which strategy is right for your application and how to build a fully event-driven application.
The document discusses using artificial intelligence and machine learning for autonomous racing. It describes using reinforcement learning to train models to control a miniature race car called AWS DeepRacer. The models are trained in a 3D racing simulator and learn to navigate a track by receiving rewards or penalties. Researchers are exploring using techniques like behavioral cloning and object detection for autonomous driving applications.
AWS Lambda Powertools is an open-source library to help organizations discover and incorporate serverless best practices early and quickly. In two years, Powertools went from a tiny pilot program to a fast-growing project. This rapid growth led to challenges ranging from balancing new features with operational excellence, triaging bug reports and RFCs, and scaling and redesigning documentation, to lowering the bar for contribution and providing a public road map. In this session, learn about the current state of Lambda Powertools, how this growth was supported, key lessons learned in the past two years, and what’s next on the horizon.
AWS Startup Garage - Building your MVP on AWS – Cobus Bernard
Tips for startups on how to think about their MVP, how to make the most of developer time, pitfalls to avoid and sample architectures of how to build the MVP on AWS.
Automatic Labelling and Model Tuning with Amazon SageMaker - AWS Summit Sydney – Amazon Web Services
Developing machine learning models requires a lot of effort, which often needs to be repeated over time as data distributions change. In this session you will learn about some of the latest concepts in Automatic Machine Learning, including how to apply them to speed up development and achieve robust models over time. You will learn how to run a custom labelling job using Amazon SageMaker Ground Truth to build a larger data set to fine-tune your model. You will also learn how to tune your model’s hyperparameters using Amazon SageMaker’s Automatic Model Tuning capabilities and understand the theory of how Bayesian optimisation is automatically applied for more accurate results and faster tuning.
This workshop requires a laptop and administrative access to your own AWS account.
How Pokémon’s SecOps team enables its business - SDD328 - AWS re:Inforce 2019 – Amazon Web Services
Pokémon’s SecOps team built an automated PII datalake pipeline allowing them to categorize data into profiles and manage permissions. We discuss how, using AWS Lambda, Amazon DynamoDB, and Amazon Simple Queue Service (Amazon SQS), they can validate any person in Active Directory, build the approval to the appropriate manager, write to DDB with a TTL, and push the appropriate access controls. This has two benefits: First, Pokémon can reuse this architecture for other permissions-based business processes, meaning a security layer can be added at the beginning. Second, it frees up security engineers to tackle larger, more important challenges.
Learn to identify use cases for machine learning (ML), acquire best practices to frame problems in a way that key stakeholders and senior management can understand and support, and help create the right conditions for delivering successful ML-based solutions to your business.
Revving up with Reinforcement Learning by Ricardo Sueiras – Alex Cachia
In this session I will share my journey, which started with me taking my children's discarded toys and trying to get them to drive themselves, and led to a fully autonomous self-driving model car.
The document discusses best practices for creating effective pitch decks to present startup ideas to investors. It provides examples of pitch decks from successful companies like Airbnb, LinkedIn, and Square that emphasize clearly explaining the problem, solution, and business model. It also stresses the importance of demonstrating growth and traction as well as having a strong founding team. Common pitfalls to avoid include not grabbing attention quickly, lacking a clear thesis, including too much unrelated content, and failing to address typical concerns about the business. The overall message is that an effective pitch focuses on problem, solution, business model, growth, and team while avoiding vague, disorganized presentations.
The document discusses Amazon Web Services' machine learning and deep learning services. It describes services like Amazon SageMaker, Amazon Rekognition, and AWS DeepLens. It provides information on building and deploying machine learning models using these services and platforms.
Introduction to the Well-Architected Framework and Tool - SVC212 - Santa Clar... – Amazon Web Services
The document provides an overview of the AWS Well-Architected Framework and Tool. It discusses the framework's history and components, including pillars, design principles, and questions to evaluate architectures. It also describes how to apply the framework through self-service reviews, partner reviews, or AWS Solutions Architect led reviews, and resources available like whitepapers, training, and the online tool.
Transform with Cloud to drive your Future | AWS Summit Tel Aviv 2019 – Amazon Web Services
Innovation and agility are not for startups only, getting a competitive edge requires combining cloud-based tools and business challenges in innovative ways to drive operating efficiency, open new revenue streams, and evolve customer engagement models. In this session we will imagine the future. We will explore how to transform with Cloud to drive your future.
In this session we will explore the role of rapid prototyping to accelerate business outcomes; diving deep on how to unlock the power of the prototyping model. Delivering outcomes in a day, a week, or a month is often challenging, so we'll explore how to define deliverables, arrange resources, run sprints, do Kanban, and produce outcomes. Hear about the AWS Rapid Innovation Prototyping methodology and other techniques to test and validate value early.
This document contains slides from a presentation given by Francisco Ruiz on creating successful pitches and avoiding common mistakes. The presentation discusses key elements to include in a pitch deck, such as clearly presenting the problem, solution, business model, growth and traction, and team. It also identifies common pitch sins to avoid, like burying the main point, having an unclear thesis, including too much information, pretending there is no competition, and unrealistic assumptions about the market. The presentation emphasizes iteratively refining the pitch deck but notes the presenter's role is not as a "PowerPoint master."
Similar to Fine-tuning BERT for Question Answering (20)
Recent Advances in Natural Language Processing – Apache MXNet
The document provides an overview of recent advances in natural language processing (NLP), including traditional methods like bag-of-words models and word2vec, as well as more recent contextualized word embedding techniques like ELMo and BERT. It discusses applications of NLP like text classification, language modeling, machine translation and question answering, and how different models like recurrent neural networks, convolutional neural networks, and transformer models are used.
GluonNLP is a deep learning toolkit for Natural Language Processing. These slides cover the motivation behind the creation of the toolkit and what is available in it. Go try it at https://gluon-nlp.mxnet.io!
Introduction to object tracking with Deep Learning – Apache MXNet
The document discusses object tracking using deep learning. It defines object tracking as locating an object across consecutive video frames. It notes applications in security, road safety, and entertainment. Object tracking differs from object detection in that the object class is unknown during training and tracking considers objects across time rather than individual frames. Challenges include objects leaving the screen or changing pose. The document discusses metrics for evaluating trackers, including accuracy and robustness, and surveys popular modern trackers.
This presentation introduces the topic of computer vision, especially through the lens of Deep Learning.
Go build! https://gluon-cv.mxnet.io
Slides: Thomas Delteil
Image Segmentation: Approaches and Challenges – Apache MXNet
These slides go over the problem of deep semantic segmentation. They cover the different approaches taken, from hourglass autoencoders to pyramid networks.
Slides by Thomas Delteil
Generative Adversarial Networks (GANs) using Apache MXNet – Apache MXNet
The document provides an overview of generative adversarial networks (GANs) using Apache MXNet. It introduces GANs and deep learning concepts. It then demonstrates how to implement GANs using MXNet with examples like DCGAN. Finally, it discusses other GAN models and provides resources for using MXNet on AWS.
Deep Learning With Apache MXNet On Video by Ben Taylor @ ziff.ai – Apache MXNet
This talk will go over using Apache MXNet on video streams, such as security footage from Ring or live Xbox video data, to perform inference and indexing. This can be used to classify video events, detect anomalies in normal behavior, and search. The talk will focus on using FFmpeg to feed Apache MXNet models for fast inference throughput and performance. It will also discuss the difference between frame-level inference and frame-buffer inference (comprehending a temporal video event).
Links to videos on the slides:
IntelAct: Winner, Visual Doom AI Competition, Full Deathmatch: https://www.youtube.com/watch?v=947bSUtuSQ0
GPU assisted call of duty processing, prep for AI auto-play: https://www.youtube.com/watch?v=gTXOYzSC_ZE
Presented at https://www.meetup.com/deep-learning-with-mxnet/events/258901722/
Using Java to deploy Deep Learning models with MXNet – Apache MXNet
The document discusses deep learning and the Apache MXNet framework. It provides an introduction to deep learning concepts like neural networks and machine learning. It then describes MXNet as an open source deep learning framework that supports multiple languages including Java. It outlines how to get started with MXNet's Java API and discusses some technical challenges around Java memory management when using deep learning models.
MXNet is a flexible and efficient deep learning framework that is programmable in multiple languages and scalable across multiple GPUs and machines. It originated from the DMLC community and is now an Apache incubating project. MXNet provides low-level NDArray and Symbol APIs as well as high-level Gluon APIs and has additional toolkits like GluonCV for computer vision tasks. MXNet supports distributed training across multiple machines using parameter servers and can serve trained models for low-latency inference using the MXNet Model Server.
This document provides an overview of recurrent neural networks (RNNs) and long short-term memory (LSTM) networks. It discusses how RNNs can be used for sequence modeling tasks like sentiment analysis, machine translation, and speech recognition by incorporating context or memory from previous steps. LSTMs are presented as an improvement over basic RNNs that can learn long-term dependencies in sequences using forget gates, input gates, and output gates to control the flow of information through the network.
What is Deep Learning
Rise of Deep Learning
Phases of Deep Learning - Training and Inference
AI & Limitations of Deep Learning
Apache MXNet History, Apache MXNet concepts
How to use Apache MXNet and Spark together for Distributed Inference.
The document discusses Apache MXNet, an open-source deep learning framework. It provides an overview of MXNet's history and key features, including support for multiple programming languages, an ecosystem of tools like GluonCV and GluonNLP, and model serving capabilities. It also describes MXNet's use of ONNX for model interchange, integration with Keras, and performance optimization using technologies like CUDA, MKL, and TVM. The document highlights MXNet's large community and adoption by customers.
In this talk ONNX (Open Neural Network eXchange) is introduced, and the ONNX Model Zoo is used as the base for fine-tuning with AWS SageMaker and Apache MXNet's Gluon API. With a fine-tuned model trained on Caltech101, AWS IoT Greengrass is discussed for edge deployments and the TVM stack is suggested as a method for optimising the inference of models on edge devices.
Presented by: Thom Lane at Linaro Connect Vancouver 2018 on 19th September 2018.
Distributed Inference with MXNet and Spark – Apache MXNet
Deep Learning has become ubiquitous with the abundance of data and the commoditization of compute and storage. Pre-trained models are readily available for many use cases. Distributed inference has many applications, such as pre-computing results offline and backfilling historic data with predictions from state-of-the-art models. Inference on large-scale datasets comes with many challenges prevalent in distributed data processing. This presentation will show how to efficiently run deep learning prediction on large data sets, leveraging Apache Spark and Apache MXNet (incubating).
This presentation describes two major papers on multivariate time series using deep neural networks. The first paper, DeepAR, was developed at Amazon to handle forecasting of millions of items, where the same model can be applied to millions of products. DeepAR is implemented as a built-in algorithm of Amazon SageMaker. A code example is provided.
The second paper, Long- and Short-Term Temporal Patterns with Deep Neural Networks, was developed at CMU and introduces a novel way to detect both short-term and long-term seasonality in data through the introduction of skip-RNN.
A Gluon implementation of the paper is provided in the presentation.
Inference at the edge is of ever-increasing importance to companies, so it is crucial to be able to make models smaller. Compressing models can be lossless or can result in a loss of accuracy. This presentation provides a survey of compression techniques for deep learning models. It then describes different architectures for AWS IoT Greengrass that combine on-device inference and GPU inference in a hub model. Additionally, the presentation introduces MXNet, which has a small footprint and is efficient for both inference and training in distributed settings.
Building Content Recommendation Systems using MXNet Gluon – Apache MXNet
The Netflix competition triggered a flurry of research on recommendation engines. This presentation provides a survey of techniques and models for creating a recommender system. It covers Matrix Factorisation, Factorisation Machines, Distributed Factorisation Machines, and DSSM networks, and provides code examples for developing a Matrix Factorisation model in Gluon. At the end, the presentation provides tips and tricks for large-scale, real-time recommender engines.
Thematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. The test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills.
Immersive Learning That Works: Research Grounding and Paths Forward – Leonel Morgado
We will metaverse into the essence of immersive learning, into its three dimensions and conceptual models. This approach encompasses elements from teaching methodologies to social involvement, through organizational concerns and technologies. Challenging the perception of learning as knowledge transfer, we introduce a 'Uses, Practices & Strategies' model operationalized by the 'Immersive Learning Brain' and ‘Immersion Cube’ frameworks. This approach offers a comprehensive guide through the intricacies of immersive educational experiences and spotlighting research frontiers, along the immersion dimensions of system, narrative, and agency. Our discourse extends to stakeholders beyond the academic sphere, addressing the interests of technologists, instructional designers, and policymakers. We span various contexts, from formal education to organizational transformation to the new horizon of an AI-pervasive society. This keynote aims to unite the iLRN community in a collaborative journey towards a future where immersive learning research and practice coalesce, paving the way for innovative educational research and practice landscapes.
This MS Word-generated PowerPoint presentation covers the major details of the micronucleus test: its significance and the assays used to conduct it. The test is used to detect micronuclei formation inside the cells of nearly every multicellular organism. Micronuclei form during chromosomal separation at metaphase.
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx – MAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation makes them the most convenient, least labor-intensive, live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poor-quality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larva. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
The technology uses reclaimed CO₂ as the dyeing medium in a closed loop process. When pressurized, CO₂ becomes supercritical (SC-CO₂). In this state CO₂ has a very high solvent power, allowing the dye to dissolve easily.
The debris of the ‘last major merger’ is dynamically young – Sérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the ‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space, because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia DR3 have positive caustic velocities, making them fundamentally different from the phase-mixed chevrons found in simulations at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based on a simple phase-mixing model, the observed number of caustics is consistent with a merger that occurred 1–2 Gyr ago. We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data 1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’ did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within the last few Gyr, consistent with the body of work surrounding the VRM.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige... – University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
Or: Beyond linear.
Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.
Disclaimer: No one is perfect, so please mind that there might be mistakes and typos.
dtubbenhauer@gmail.com
Corrected slides: dtubbenhauer.com/talks.html
Authoring a personal GPT for your research and practice: How we created the Q... – Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr... – Travis Hills MN
Travis Hills of Minnesota developed a method to convert waste into high-value dry fertilizer, significantly enriching soil quality. By providing farmers with a valuable resource derived from waste, Travis Hills helps enhance farm profitability while promoting environmental stewardship. Travis Hills' sustainable practices lead to cost savings and increased revenue for farmers by improving resource efficiency and reducing waste.
The binding of cosmological structures by massless topological defects – Sérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is mitigated, at least in part.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Invited talk at the Journées Nationales du GDR GPL 2024.
ESPP presentation to EU Waste Water Network, 4th June 2024: “EU policies driving nutrient removal and recycling and the revised UWWTD (Urban Waste Water Treatment Directive)”
Describing and Interpreting an Immersive Learning Case with the Immersion Cub... – Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
7. (refresher) BERT (Devlin et al. 18): Pre-training
[Slide diagram: each INPUT token is represented as the sum of its WordPiece embedding, sentence embedding, and position embedding, all learned during the (pre)training process; a masked token is replaced by [MASK], whose embedding E[MASK] enters the model like any other token's.]
In pre-training, 15% of the input tokens are masked for the masked LM task. There is also a sentence-level task (next sentence prediction).
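As a toy illustration of this input construction and masking step, here is a hedged sketch with invented sizes and random embedding tables (not the deck's code; the token ids are arbitrary examples):

```python
# Toy sketch: BERT input vectors as the sum of WordPiece, sentence (segment),
# and position embeddings, plus selection of ~15% of tokens for masked LM.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, max_len, d = 30522, 128, 768           # BERT-base-like sizes
tok_emb = rng.normal(scale=0.02, size=(vocab_size, d))
seg_emb = rng.normal(scale=0.02, size=(2, d))      # sentence A / sentence B
pos_emb = rng.normal(scale=0.02, size=(max_len, d))

token_ids = np.array([101, 7592, 2088, 102, 2023, 2003, 1037, 3231, 102])
segment_ids = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1])   # A A A A B B B B B
positions = np.arange(len(token_ids))

# All three tables are learned jointly during pre-training; the model's
# input is simply their element-wise sum.
inputs = tok_emb[token_ids] + seg_emb[segment_ids] + pos_emb[positions]

# Pick ~15% of positions for the masked-LM objective (this toy version
# only skips [CLS] at position 0 and the final [SEP]).
candidates = np.arange(1, len(token_ids) - 1)
n_mask = max(1, int(0.15 * len(candidates)))
masked = rng.choice(candidates, size=n_mask, replace=False)
print(inputs.shape, sorted(masked))
```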
Task 1: Masked language model (MLM)
Task 2: Next sentence prediction
Note that the first token is always forced to be [CLS] — a placeholder that will be used later for prediction in downstream tasks.
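To make the role of [CLS] concrete, here is a hedged sketch of how its final hidden state feeds a small classifier, used for next sentence prediction during pre-training and for downstream classification after fine-tuning; the shapes are illustrative and `encoded` merely stands in for BERT's encoder output:

```python
# Sketch: the [CLS] position's final hidden state as a sequence summary.
import numpy as np

rng = np.random.default_rng(0)
batch, seq_len, d = 4, 128, 768
encoded = rng.normal(size=(batch, seq_len, d))     # stand-in encoder output

cls_vec = encoded[:, 0, :]                         # position 0 is [CLS]
W = rng.normal(scale=0.02, size=(d, 2))            # 2 classes: IsNext / NotNext
b = np.zeros(2)
logits = cls_vec @ W + b
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(probs.shape)                                 # (4, 2)
```

The same pattern carries over to fine-tuning: for sequence-level tasks the classifier on top of [CLS] is replaced by a task-specific head, while span tasks like SQuAD-style QA use per-position heads instead, as in the sketch at the top of this deck.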