Fine-tuning BERT for Question Answering


This deck covers fine-tuning a pre-trained BERT model for the Question Answering task. The GluonNLP model zoo has models and tutorials: http://gluon-nlp.mxnet.io/model_zoo/bert/index.html Slides: Thomas Delteil

© 2019, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
Automatic Question Answering
through BERT fine-tuning

Question Answering task
Q: _______________ ?
A: _______
Or
A: impossible

SQuAD:
The Stanford Question Answering Dataset
SQuAD v1.1: Single good answer
SQuAD v2: One or more good answers, plus questions that are impossible to answer

SQuAD:
The Stanford Question Answering Dataset
Partial answer
query
answer

SQuAD:
The Stanford Question Answering Dataset
https://rajpurkar.github.io/SQuAD-explorer/

Bi-Directional Attention Flow for Machine Comprehension (BiDAF), Seo et al. 2018

(refresher) BERT (Devlin et al. 18): Pre-training
INPUT = WordPiece embeddings + sentence embeddings + position embeddings
Embeddings are learned during the (pre)training process
[MASK] token, embedded as E_[MASK]
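
A toy sketch of how the three embeddings combine into the input representation: for each position, the WordPiece, sentence, and position vectors are summed element-wise. The function name `embed` and the hand-written lookup tables are hypothetical and only for illustration; in a real model the tables are learned parameters.

```python
def embed(token_ids, segment_ids, tok_table, seg_table, pos_table):
    """Sum token, sentence (segment) and position embeddings per position."""
    out = []
    for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        vec = [
            t + s + p
            for t, s, p in zip(tok_table[tok], seg_table[seg], pos_table[pos])
        ]
        out.append(vec)
    return out


# Tiny hypothetical tables with embedding size 2.
tok_table = [[1.0, 0.0], [0.0, 1.0]]      # one row per WordPiece id
seg_table = [[0.5, 0.5], [2.0, 2.0]]      # sentence A / sentence B
pos_table = [[0.0, 0.0], [10.0, 10.0]]    # one row per position

inputs = embed([1, 0], [0, 1], tok_table, seg_table, pos_table)
print(inputs)
```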
In pre-training, 15% of the input tokens are masked for the masked LM task
There is also a sentence-pair task (next sentence prediction)
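
A simplified sketch of the masking step for the masked LM task. It assumes every selected token is replaced by `[MASK]`; the original BERT recipe also leaves some selected tokens unchanged or swaps in random tokens, which is omitted here. The function name `mask_tokens`, the fixed seed, and the rounding rule are assumptions made for illustration.

```python
import random


def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Replace about mask_rate of the non-special tokens with mask_token."""
    rng = random.Random(seed)
    # Never mask the special tokens that frame the input.
    candidates = [i for i, t in enumerate(tokens) if t not in ("[CLS]", "[SEP]")]
    n_to_mask = max(1, round(len(candidates) * mask_rate))
    positions = sorted(rng.sample(candidates, n_to_mask))
    masked = list(tokens)
    for i in positions:
        masked[i] = mask_token
    return masked, positions


tokens = ["[CLS]"] + ["w%d" % i for i in range(20)] + ["[SEP]"]
masked, positions = mask_tokens(tokens)
print(masked)
print(positions)
```

The returned positions are the ones the model would be asked to predict during pre-training.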

http://jalammar.github.io/illustrated-bert/

BERT (Devlin et al. 18): Fine-Tuning for QA
Training for the Question Answering task (modified from the original paper)
Fine-tuning
Add small weight matrix $W_{SPAN\_START\_END}$ on top of the final-layer hidden states $h_i^L$
Modified network in Gluon (BERT + Dense layer)
impossible = start[CLS] + end[CLS]
Question is impossible to answer if:
$\max(start) + \max(end) - (start_{CLS} + end_{CLS}) < impossible\_threshold$
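
A minimal sketch of the span head and the impossibility rule above, in plain Python rather than Gluon. The names `span_logits` and `is_impossible`, the layout of `W` as one (start, end) weight pair per hidden dimension, the default threshold of 0.0, and taking the max over the non-[CLS] positions are all assumptions for illustration, not the GluonNLP implementation.

```python
def span_logits(hidden, W):
    """Project each final-layer hidden state onto a start and an end logit.

    hidden: list of seq_len vectors of size d; position 0 is [CLS].
    W: list of d rows [w_start, w_end] (hypothetical layout of the dense layer).
    """
    start, end = [], []
    for h in hidden:
        start.append(sum(x * w[0] for x, w in zip(h, W)))
        end.append(sum(x * w[1] for x, w in zip(h, W)))
    return start, end


def is_impossible(start, end, threshold=0.0):
    """Apply the rule: best span score minus [CLS] score below the threshold."""
    null_score = start[0] + end[0]                 # start[CLS] + end[CLS]
    span_score = max(start[1:]) + max(end[1:])     # assumption: [CLS] excluded
    return span_score - null_score < threshold


# Toy hidden states for [CLS] plus two tokens, hidden size 2.
hidden = [[2.0, 0.0], [0.0, 1.0], [1.0, 3.0]]
W = [[1.0, 1.0], [0.5, 2.0]]

start, end = span_logits(hidden, W)
print(start, end)
print(is_impossible(start, end))                  # default threshold
print(is_impossible(start, end, threshold=6.0))   # stricter threshold
```

Like the rule on the slide, this compares the two maxima independently and does not check that the best start comes before the best end; a full decoder would search over valid (start, end) pairs. The threshold would typically be tuned on a development set.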

Lab

Go build!
• http://gluon-nlp.mxnet.io/
Get help:
• https://discuss.mxnet.io/


Thornton ESPP slides UK WW Network 4_6_24.pdf

European Sustainable Phosphorus Platform

Describing and Interpreting an Immersive Learning Case with the Immersion Cub...

Leonel Morgado

Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.

Fine-tuning BERT for Question Answering

3. SQuAD: The Stanford Question Answering Dataset. SQuAD v1.1: a single good answer. SQuAD v2: one or more good answers, plus questions that are impossible to answer.

6. Bi-Directional Attention Flow for machine comprehension (BiDAF), Seo et al. 2018

7. (refresher) BERT (Devlin et al. 18): Pre-training. INPUT: WordPiece embeddings, sentence embeddings, and position embeddings, all learned during the (pre)training process. Figure labels: [MASK], E_[MASK]. In pre-training, 15% of the input tokens are masked for the masked LM task. There is also a next sentence prediction task.
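The masking step described on this slide can be sketched in a few lines of plain Python. This is a simplified illustration, not the deck's or GluonNLP's code: the function name `mask_tokens`, the `seed` parameter, and the example sentence are assumptions made for the sketch. It replaces every selected token with [MASK], whereas the original BERT recipe replaces only 80% of the selected tokens with [MASK], swaps 10% for a random token, and leaves 10% unchanged.

```python
import random

MASK_TOKEN = "[MASK]"
SPECIAL_TOKENS = ("[CLS]", "[SEP]")


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Replace about mask_rate of the non-special tokens with [MASK].

    Returns the masked sequence and a label list that holds the original
    token at each masked position and None everywhere else.
    (Hypothetical helper, simplified from the BERT masking recipe.)
    """
    rng = random.Random(seed)
    # Special tokens such as [CLS] and [SEP] are never masked.
    candidates = [i for i, t in enumerate(tokens) if t not in SPECIAL_TOKENS]
    n_to_mask = min(len(candidates), max(1, round(len(candidates) * mask_rate)))
    chosen = set(rng.sample(candidates, n_to_mask))
    masked = [MASK_TOKEN if i in chosen else t for i, t in enumerate(tokens)]
    labels = [t if i in chosen else None for i, t in enumerate(tokens)]
    return masked, labels


tokens = ["[CLS]", "where", "is", "the", "eiffel", "tower", "?", "[SEP]",
          "the", "eiffel", "tower", "is", "in", "paris", ".", "[SEP]"]
masked, labels = mask_tokens(tokens)
print(masked)
```

The label list is what the masked LM loss would be computed against: only the masked positions contribute, which is why unmasked positions carry None.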

8. http://jalammar.github.io/illustrated-bert/ (refresher) BERT (Devlin et al. 18): Pre-training

9. http://jalammar.github.io/illustrated-bert/ (refresher) BERT (Devlin et al. 18): Pre-training

10. BERT (Devlin et al. 18): Fine-Tuning for QA. Training for the question answering task (modified from the original paper). Fine-tuning: add a small weight matrix $W_{\mathrm{SPAN\_START\_END}}$ on top of the final-layer token representations $h^L_i$. Modified network in Gluon (BERT + Dense layer). impossible = start[CLS] + end[CLS]. Question is impossible to answer if: $\max(\mathrm{start}) + \max(\mathrm{end}) - (\mathrm{start}_{\mathrm{CLS}} + \mathrm{end}_{\mathrm{CLS}}) < \mathrm{impossible\_threshold}$
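The impossibility rule on this slide can be written out as a small function. The sketch below follows the slide's formula literally and assumes the start and end scores are plain Python lists with the [CLS] token at index 0; the function name `is_impossible`, the `cls_index` parameter, and the example scores and threshold are hypothetical, chosen only for illustration. Practical implementations usually also require the chosen start position to come before the end position, which this sketch does not enforce.

```python
def is_impossible(start_scores, end_scores, threshold, cls_index=0):
    """Apply the slide's rule for flagging an unanswerable question.

    start_scores / end_scores: one score per input token, as produced by
    the dense span layer on top of BERT.
    threshold: the impossible_threshold from the slide (value is tuned,
    not given in the deck).
    """
    # Score of the "no answer" option: start[CLS] + end[CLS].
    null_score = start_scores[cls_index] + end_scores[cls_index]
    # Score of the best span, taken as max(start) + max(end) as on the slide.
    best_span_score = max(start_scores) + max(end_scores)
    return best_span_score - null_score < threshold


# A confident span: the best span clearly beats the [CLS] score.
answerable = is_impossible([1.0, 0.2, 5.0, 0.1], [0.5, 0.1, 0.3, 6.0], threshold=1.0)
# The [CLS] position dominates, so the question is flagged as impossible.
unanswerable = is_impossible([4.0, 0.2, 1.0, 0.1], [3.5, 0.1, 0.3, 1.2], threshold=1.0)
print(answerable, unanswerable)
```

Raising the threshold makes the model abstain more often; lowering it makes the model answer more often, so the value is normally chosen on a development set.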

Editor's Notes

First call deck for a high-level introduction to Apache MXNet.
Task 1: Masked language model (MLM). Task 2: Next sentence prediction. Note that the first token is always forced to be [CLS], a placeholder that will be used later for prediction in downstream tasks.
