Mattingly "Text and Data Mining: Building Data Driven Applications"

National Information Standards Organization (NISO)

This presentation was provided by William Mattingly of the Smithsonian Institution for the eighth and final session of NISO's 2023 Training Series on Text and Data Mining. Session eight, "Building Data Driven Applications," was held on Thursday, December 7, 2023.

Text and Data Mining
Building Data-Driven Apps

Goals
1. Refresher on Vector Databases
2. Large Language Models
3. Retrieval-Augmented Generation (RAG)
4. Streamlit
5. Gradio
6. R Shiny

Large Language
Models
What are they?
● Large language models (LLMs)
like GPT-4 are advanced AI
systems trained on extensive
datasets to understand and
generate human-like text.

Large Language
Models
What are they?
● Natural Language
Understanding (NLU)
● Natural Language Generation
(NLG)

Large Language
Models
Training
● LLMs are trained on vast
amounts of text data from the
internet, books, and other
sources. This training enables
them to learn language
patterns, context, and various
knowledge domains.
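
As a rough illustration of what "learning language patterns" from text can mean, the sketch below counts which word follows which in a tiny corpus and uses those counts to predict the next word. This is a toy bigram model, not how GPT-4 or any production LLM is trained; the corpus and the function names `train_bigrams` and `predict_next` are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, how often each other word follows it."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; real models train on vastly more text.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigrams(corpus)
print(predict_next(model, "the"))
```

Real LLMs replace these counts with a neural network holding billions of parameters, but the underlying idea of predicting likely continuations from patterns seen in training text is similar.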

Large Language
Models
Where do they excel?
● Generating new data based on
their knowledge of existing
data.
○ Code
○ Essays
○ Images

Large Language
Models
Limitations
● Hallucinations - Generating
incorrect data
● Ethics and Biases
● Copyright Infringement

RAG
What is it?
● RAG lets you combine the
strengths of large language
models (LLMs) with those of
vector databases
● It reduces the chance that an
LLM hallucinates (generates
fake information)
● It uses a vector database to
find material relevant to a
query
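
The retrieval step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the presenter's implementation: the `DOCUMENTS` list stands in for a vector database, `embed` is a toy bag-of-words vector rather than a trained embedding model, and all function names are hypothetical. The final prompt would be sent to an LLM of your choice; that call is left out here.

```python
import math
import re
from collections import Counter

# Hypothetical mini corpus standing in for the contents of a vector database.
DOCUMENTS = [
    "Streamlit is a Python library for building data apps.",
    "Gradio is a Python library for building machine learning demos.",
    "R Shiny is a framework for building web apps in R.",
]


def embed(text):
    """Toy embedding: word counts (real systems use a trained embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Combine the retrieved context with the question for the LLM."""
    context = "\n".join(retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


print(build_prompt("What is Gradio?", DOCUMENTS))
```

Because the model is asked to answer from retrieved text rather than from memory alone, it has less room to invent details, which is the sense in which RAG reduces hallucination.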


1. Text and Data Mining: Building Data-Driven Apps

2. Goals: (1) Refresher on Vector Databases; (2) Large Language Models; (3) Retrieval-Augmented Generation (RAG); (4) Streamlit; (5) Gradio; (6) R Shiny

3. Multi-Modal: How does it work?

4. Large Language Models

5. Large Language Models What are they? ● Large language models (LLMs) like GPT-4 are advanced AI systems trained on extensive datasets to understand and generate human-like text.

6. Large Language Models What are they? ● Natural Language Understanding (NLU) ● Natural Language Generation (NLG)

8. Large Language Models Training ● LLMs are trained on vast amounts of text data from the internet, books, and other sources. This training enables them to learn language patterns, context, and various knowledge domains.

9. Large Language Models Where do they excel? ● Generating new data based on their knowledge of existing data. ○ Code ○ Essays ○ Images

10. Large Language Models Limitations ● Hallucinations - Generating incorrect data ● Ethics and Biases ● Copyright Infringement

11. Retrieval-Augmented Generation

12. How tall is Wookie?

15. RAG What is it? ● RAG lets you combine the strengths of large language models (LLMs) with those of vector databases ● It reduces the chance that an LLM hallucinates (generates fake information) ● It uses a vector database to find material relevant to a query
