SlideShare a Scribd company logo
Prompt Design
LLMs with NER
1. GliNER
2. Large Language Models NER
3. Vector Databases
4. Semantic Searching
5. RAG
Goals
Machine
Learning
● GliNER => A transformer
architecture that allows you to
pass a text and your own
labels to a model without any
training.
Example:
https://huggingface.co/spaces/toma
arsen/gliner_medium-v2.1
Zero-Shot NER
Large Language Models
LLMs
● Contextual Understanding
● Less Manual Effort
● Adaptability
● Improved Accuracy
● Multilingual Capability
Benefits
LLMs
● Resource Intensity (and Cost)
● Data Privacy Concerns
● Black Box Models
● Training Data Bias
● Generalization Challenges
● Latency Issues
● Hallucinations
● Consistency
Limitations
LLMs
● Thinking through your
methodology for NER
● Assisting in certain steps of
NER (RegEx)
● Zero-Shot NER
● Few-Shot NER
How to use LLMs
Mrs. Jessica Monica Kapitan works at the
office. Mrs. Kapitan is a lawyer. She is also
friends with Mrs. Thompson and Miss. Smith.
Sometimes Miss. Smith will miss her train.
Exercise 1: Capture all examples of Miss. and
Mrs. in the text with their corresponding
names using an LLM to generate RegEx
https://regex101.com/r/TLfbGE/1
Exercise 1: One Solution
b(Mrs.|Miss.)s+([A-Z][a-z]*(?:s+[A-Z][a-z]*)*)
Mr. Thomas and Dr. Jessica Davis went to the
store. They met Mrs. Stevens who works at a
nearby office. They are all friends with Colonel
Jackson. Col. Jackson is known to her friends
by her first name, Terry. They all know Mr.
and Mrs. Kapitan.
Exercise 2: Capture all examples [Honorific
Entity] in the text with their corresponding
names using an LLM to generate RegEx
https://regex101.com/r/FYcO8C/1
Exercise 2: One Solution
b(Mr.|Mrs.|Miss.|Dr.|Colonel|Col.)s+([A-Z][a-z]*(?:s+[A-Z][a-z]*)*)
Exercise 3: Use an LLM to identify the people
in the following text. Think through an ethical
way to use an LLM to identify potential women
in these contexts.
Dr. Tracey Jordan works at the Smithsonian where he develops methods to identify named entities. Mrs. Alex Jackson leads the team.
She was trained in machine learning at Stanford. While Tracey functions as the domain expert, Alex Jackson designs the experiments.
They have another colleague, Leslie Peters.
Vector Databases
Representing
Texts
Digitally
Embeddings
● The apple is in the tree.
○ 1-[0.01234, -0.23456, 0.87654,
0.45678, -0.56123, 0.65432,
0.12345, -0.77123, 0.08456,
0.34567, ...]
○ 2-different vector
○ 3-different vector
○ 4-different vector
○ 1-[0.01234, -0.23456, 0.87654,
0.45678, -0.56123, 0.65432,
0.12345, -0.77123, 0.08456,
0.34567, ...]
○ 5-different vector
Vector
Database
What is it?
● It holds vectors in a database
as storage.
● Similar vectors are stored
closer.
Vector
Database
How do we use a vector
database?
● We populate a vector database
with by using a machine
learning model to vectorize
data and send them to the
database.
Vector
Database
Why use a vector database?
Vector
Database
Why use a vector database?
● Vector databases allow users
to store vector data in a way
that allows users to query it
and find similarity based on a
vector-level similarity, rather
than explicit human-defined
similarity.
Vector
Database
What is it?
● A vector database holds
numerous vectors or
embeddings of data.
Sometimes, the database will
also store the original data
alongside these vectors.
Vector Database Stacks
Vector Database Stacks
Vector Database
Stacks
What is available to us?
● Python, Annoy, Streamlit
○ Cheap, easy to deploy, great for
smaller datasets, but requires a
little bit of knowledge to build from
scratch
○ Best for smaller databases (under
10,000 data)
● Python, txtAI
○ Cheap and easy to use, more
resource intensive but easy to
deploy
○ Allows for easy interpretability (via
highlighting)
Multi-Modal
How does it work?
Retrieval-Augmented Generation
How tall is Wookie?
How tall is Wookie?
RAG
What is it?
● RAG allows for you to combine
the strengths of large language
models (LLMs) with vector
databases
● It limits the chances for an LLM
to hallucinate (generate fake
information)
● It uses a vector database to
find relevant material to a
query
RAG
What is it?
● RAG allows for you to combine
the strengths of large language
models (LLMs) with vector
databases
● It limits the chances for an LLM
to hallucinate (generate fake
information)
● It uses a vector database to
find relevant material to a
query
1
2
3
4
5 6

More Related Content

Similar to Mattingly "AI and Prompt Design: LLMs with NER"

10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
Xavier Amatriain
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
Xavier Amatriain
 
Software development fundamentals
Software development fundamentalsSoftware development fundamentals
Software development fundamentals
Alfred Jett Grandeza
 
Andrii Belas "Modern approaches to working with categorical data in machine l...
Andrii Belas "Modern approaches to working with categorical data in machine l...Andrii Belas "Modern approaches to working with categorical data in machine l...
Andrii Belas "Modern approaches to working with categorical data in machine l...
Lviv Startup Club
 
Introducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en AzureIntroducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en Azure
Plain Concepts
 
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP MeetupDealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Yves Peirsman
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
Awantik Das
 
Deep Learning Design Patterns Study Jams - Overview
Deep Learning Design Patterns Study Jams - OverviewDeep Learning Design Patterns Study Jams - Overview
Deep Learning Design Patterns Study Jams - Overview
Margaret Maynard-Reid
 
Deep learning for NLP
Deep learning for NLPDeep learning for NLP
Deep learning for NLP
Shishir Choudhary
 
Introduction to Data Structures & Algorithms
Introduction to Data Structures & AlgorithmsIntroduction to Data Structures & Algorithms
Introduction to Data Structures & Algorithms
Afaq Mansoor Khan
 
Data science course brochure - Data Science Institute Faridabad
Data science course brochure - Data Science Institute FaridabadData science course brochure - Data Science Institute Faridabad
Data science course brochure - Data Science Institute Faridabad
Vivek Kumar
 
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Pramati Technologies
 
Entity embeddings for categorical data
Entity embeddings for categorical dataEntity embeddings for categorical data
Entity embeddings for categorical data
Paul Skeie
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
Sanghamitra Deb
 
Choosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your needChoosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your need
GibDevs
 
Introduction to Text Mining and Topic Modelling
Introduction to Text Mining and Topic ModellingIntroduction to Text Mining and Topic Modelling
Introduction to Text Mining and Topic Modelling
David Paule
 
Global Azure Bootcamp - ML.NET for developers
Global Azure Bootcamp - ML.NET for developersGlobal Azure Bootcamp - ML.NET for developers
Global Azure Bootcamp - ML.NET for developers
Chris Melinn
 
Distributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at SalesforceDistributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at Salesforce
Docker, Inc.
 
Introduction
IntroductionIntroduction
Introduction
Preeti Mishra
 
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Xavier Amatriain
 

Similar to Mattingly "AI and Prompt Design: LLMs with NER" (20)

10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
 
Software development fundamentals
Software development fundamentalsSoftware development fundamentals
Software development fundamentals
 
Andrii Belas "Modern approaches to working with categorical data in machine l...
Andrii Belas "Modern approaches to working with categorical data in machine l...Andrii Belas "Modern approaches to working with categorical data in machine l...
Andrii Belas "Modern approaches to working with categorical data in machine l...
 
Introducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en AzureIntroducción a NLP (Natural Language Processing) en Azure
Introducción a NLP (Natural Language Processing) en Azure
 
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP MeetupDealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
 
Deep Learning Design Patterns Study Jams - Overview
Deep Learning Design Patterns Study Jams - OverviewDeep Learning Design Patterns Study Jams - Overview
Deep Learning Design Patterns Study Jams - Overview
 
Deep learning for NLP
Deep learning for NLPDeep learning for NLP
Deep learning for NLP
 
Introduction to Data Structures & Algorithms
Introduction to Data Structures & AlgorithmsIntroduction to Data Structures & Algorithms
Introduction to Data Structures & Algorithms
 
Data science course brochure - Data Science Institute Faridabad
Data science course brochure - Data Science Institute FaridabadData science course brochure - Data Science Institute Faridabad
Data science course brochure - Data Science Institute Faridabad
 
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
Document Clustering using LDA | Haridas Narayanaswamy [Pramati]
 
Entity embeddings for categorical data
Entity embeddings for categorical dataEntity embeddings for categorical data
Entity embeddings for categorical data
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
 
Choosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your needChoosing a Machine Learning technique to solve your need
Choosing a Machine Learning technique to solve your need
 
Introduction to Text Mining and Topic Modelling
Introduction to Text Mining and Topic ModellingIntroduction to Text Mining and Topic Modelling
Introduction to Text Mining and Topic Modelling
 
Global Azure Bootcamp - ML.NET for developers
Global Azure Bootcamp - ML.NET for developersGlobal Azure Bootcamp - ML.NET for developers
Global Azure Bootcamp - ML.NET for developers
 
Distributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at SalesforceDistributed Deep Learning with Docker at Salesforce
Distributed Deep Learning with Docker at Salesforce
 
Introduction
IntroductionIntroduction
Introduction
 
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
 

More from National Information Standards Organization (NISO)

Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
National Information Standards Organization (NISO)
 
Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"
National Information Standards Organization (NISO)
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
National Information Standards Organization (NISO)
 
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
National Information Standards Organization (NISO)
 
Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"
National Information Standards Organization (NISO)
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
National Information Standards Organization (NISO)
 
Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
National Information Standards Organization (NISO)
 
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
National Information Standards Organization (NISO)
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
National Information Standards Organization (NISO)
 
Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"
National Information Standards Organization (NISO)
 
Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"
National Information Standards Organization (NISO)
 
Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"
National Information Standards Organization (NISO)
 
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
National Information Standards Organization (NISO)
 
Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"
National Information Standards Organization (NISO)
 
Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"
National Information Standards Organization (NISO)
 

More from National Information Standards Organization (NISO) (20)

Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
Jemison, MacLaughlin, and Majumder "Broadening Pathways for Editors and Authors"
 
Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"Benner "Expanding Pathways to Publishing Careers"
Benner "Expanding Pathways to Publishing Careers"
 
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
Pollock and Snow "DEIA in the Scholarly Landscape, Session One: Setting Expec...
 
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
Mattingly "AI & Prompt Design: Limitations and Solutions with LLMs"
 
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
Mattingly "AI and Prompt Design: LLMs with Text Classification and Open Source"
 
Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"Mattingly "AI & Prompt Design: Named Entity Recognition"
Mattingly "AI & Prompt Design: Named Entity Recognition"
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"Mattingly "AI & Prompt Design: The Basics of Prompt Design"
Mattingly "AI & Prompt Design: The Basics of Prompt Design"
 
Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
 
Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
 
Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"
 
Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"
 
Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"
 
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
 
Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"
 
Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"
 

Recently uploaded

South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
Jyoti Chand
 
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Akanksha trivedi rama nursing college kanpur.
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
สมใจ จันสุกสี
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
Celine George
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
Wahiba Chair Training & Consulting
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
BBR 2024 Summer Sessions Interview Training
BBR  2024 Summer Sessions Interview TrainingBBR  2024 Summer Sessions Interview Training
BBR 2024 Summer Sessions Interview Training
Katrina Pritchard
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
Himanshu Rai
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
Celine George
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
eBook.com.bd (প্রয়োজনীয় বাংলা বই)
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
PECB
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
siemaillard
 
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Diana Rendina
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 
Advanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docxAdvanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docx
adhitya5119
 
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptxChapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Denish Jangid
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
Celine George
 

Recently uploaded (20)

South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
Wound healing PPT
Wound healing PPTWound healing PPT
Wound healing PPT
 
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
 
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
คำศัพท์ คำพื้นฐานการอ่าน ภาษาอังกฤษ ระดับชั้น ม.1
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience How to Create a More Engaging and Human Online Learning Experience
How to Create a More Engaging and Human Online Learning Experience
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
BBR 2024 Summer Sessions Interview Training
BBR  2024 Summer Sessions Interview TrainingBBR  2024 Summer Sessions Interview Training
BBR 2024 Summer Sessions Interview Training
 
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem studentsRHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
RHEOLOGY Physical pharmaceutics-II notes for B.pharm 4th sem students
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
How to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 InventoryHow to Setup Warehouse & Location in Odoo 17 Inventory
How to Setup Warehouse & Location in Odoo 17 Inventory
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...
 
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptxPrésentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
Présentationvvvvvvvvvvvvvvvvvvvvvvvvvvvv2.pptx
 
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
Reimagining Your Library Space: How to Increase the Vibes in Your Library No ...
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
Advanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docxAdvanced Java[Extra Concepts, Not Difficult].docx
Advanced Java[Extra Concepts, Not Difficult].docx
 
Chapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptxChapter wise All Notes of First year Basic Civil Engineering.pptx
Chapter wise All Notes of First year Basic Civil Engineering.pptx
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 

Mattingly "AI and Prompt Design: LLMs with NER"

  • 2. 1. GliNER 2. Large Language Models NER 3. Vector Databases 4. Semantic Searching 5. RAG Goals
  • 3. Machine Learning ● GliNER => A transformer architecture that allows you to pass a text and your own labels to a model without any training. Example: https://huggingface.co/spaces/toma arsen/gliner_medium-v2.1 Zero-Shot NER
  • 5. LLMs ● Contextual Understanding ● Less Manual Effort ● Adaptability ● Improved Accuracy ● Multilingual Capability Benefits
  • 6. LLMs ● Resource Intensity (and Cost) ● Data Privacy Concerns ● Black Box Models ● Training Data Bias ● Generalization Challenges ● Latency Issues ● Hallucinations ● Consistency Limitations
  • 7. LLMs ● Thinking through your methodology for NER ● Assisting in certain steps of NER (RegEx) ● Zero-Shot NER ● Few-Shot NER How to use LLMs
  • 8. Mrs. Jessica Monica Kapitan works at the office. Mrs. Kapitan is a lawyer. She is also friends with Mrs. Thompson and Miss. Smith. Sometimes Miss. Smith will miss her train.
  • 9. Exercise 1: Capture all examples of Miss. and Mrs. in the text with their corresponding names using an LLM to generate RegEx https://regex101.com/r/TLfbGE/1
  • 10. Exercise 1: One Solution b(Mrs.|Miss.)s+([A-Z][a-z]*(?:s+[A-Z][a-z]*)*)
  • 11. Mr. Thomas and Dr. Jessica Davis went to the store. They met Mrs. Stevens who works at a nearby office. They are all friends with Colonel Jackson. Col. Jackson is known to her friends by her first name, Terry. They all know Mr. and Mrs. Kapitan.
  • 12. Exercise 2: Capture all examples [Honorific Entity] in the text with their corresponding names using an LLM to generate RegEx https://regex101.com/r/FYcO8C/1
  • 13. Exercise 2: One Solution b(Mr.|Mrs.|Miss.|Dr.|Colonel|Col.)s+([A-Z][a-z]*(?:s+[A-Z][a-z]*)*)
  • 14. Exercise 3: Use an LLM to identify the people in the following text. Think through an ethical way to use an LLM to identify potential women in these contexts. Dr. Tracey Jordan works at the Smithsonian where he develops methods to identify named entities. Mrs. Alex Jackson leads the team. She was trained in machine learning at Stanford. While Tracey functions as the domain expert, Alex Jackson designs the experiments. They have another colleague, Leslie Peters.
  • 16. Representing Texts Digitally Embeddings ● The apple is in the tree. ○ 1-[0.01234, -0.23456, 0.87654, 0.45678, -0.56123, 0.65432, 0.12345, -0.77123, 0.08456, 0.34567, ...] ○ 2-different vector ○ 3-different vector ○ 4-different vector ○ 1-[0.01234, -0.23456, 0.87654, 0.45678, -0.56123, 0.65432, 0.12345, -0.77123, 0.08456, 0.34567, ...] ○ 5-different vector
  • 17. Vector Database What is it? ● It holds vectors in a database as storage. ● Similar vectors are stored closer.
  • 18.
  • 19. Vector Database How do we use a vector database? ● We populate a vector database with by using a machine learning model to vectorize data and send them to the database.
  • 20. Vector Database Why use a vector database?
  • 21. Vector Database Why use a vector database? ● Vector databases allow users to store vector data in a way that allows users to query it and find similarity based on a vector-level similarity, rather than explicit human-defined similarity.
  • 22. Vector Database What is it? ● A vector database holds numerous vectors or embeddings of data. Sometimes, the database will also store the original data alongside these vectors.
  • 25. Vector Database Stacks What is available to us? ● Python, Annoy, Streamlit ○ Cheap, easy to deploy, great for smaller datasets, but requires a little bit of knowledge to build from scratch ○ Best for smaller databases (under 10,000 data) ● Python, txtAI ○ Cheap and easy to use, more resource intensive but easy to deploy ○ Allows for easy interpretability (via highlighting)
  • 28. How tall is Wookie?
  • 29.
  • 30. How tall is Wookie?
  • 31. RAG What is it? ● RAG allows for you to combine the strengths of large language models (LLMs) with vector databases ● It limits the chances for an LLM to hallucinate (generate fake information) ● It uses a vector database to find relevant material to a query
  • 32. RAG What is it? ● RAG allows for you to combine the strengths of large language models (LLMs) with vector databases ● It limits the chances for an LLM to hallucinate (generate fake information) ● It uses a vector database to find relevant material to a query 1 2 3 4 5 6