SlideShare a Scribd company logo
1 of 86
CS60057 Speech &Natural Language Processing Autumn 2007 Lecture 5 2 August 2007
WORDS The Building Blocks of Language
[object Object],[object Object]
Tokens, Types and Texts ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Extracting text from the Web ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Extracting text from NLTK Corpora ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Brown Corpus ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
 
Corpus Linguistics ,[object Object],[object Object],[object Object],[object Object],[object Object]
What’s a word? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Another example ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Some Useful Empirical Observations ,[object Object],[object Object],[object Object],[object Object],[object Object]
Common words in  Tom Sawyer but words in NL have an uneven distribution…
Text properties (formalized) Sample word frequency data
Frequency of frequencies ,[object Object],[object Object],[object Object],[object Object],[object Object]
Zipf’s Law ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Zipf’s Law ,[object Object]
Zipf curve
Predicting Occurrence Frequencies ,[object Object],[object Object],[object Object],[object Object],Fraction of words with frequency  n  is: Fraction  of words appearing only once is therefore ½.
Explanations for Zipf’s Law ,[object Object]
Zipf’s First Law ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Zipf’s Second Law ,[object Object],[object Object],[object Object]
Zipf’s Third Law ,[object Object],[object Object],[object Object]
Zipf’s Law Impact on Language Analysis ,[object Object],[object Object]
Vocabulary Growth ,[object Object],[object Object],[object Object]
Heaps’ Law ,[object Object],[object Object],[object Object],[object Object]
Heaps’ Law Data
Word counts are interesting... ,[object Object],[object Object],[object Object],[object Object],[object Object]
Zipf’s Law on Tom Saywer ,[object Object],[object Object],[object Object],[object Object]
Plot of Zipf’s Law ,[object Object],[object Object]
Plot of Zipf’s Law (con’t) ,[object Object],[object Object]
Zipf’s Law, so what? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
N-Grams and Corpus Linguistics
A bad language model N-grams & Language Modeling
A bad language model
A bad language model Herman is reprinted with permission from LaughingStock Licensing Inc., Ottawa Canada.  All rights reserved.
What’s a Language Model ,[object Object],[object Object],[object Object]
What’s a language model for? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Next Word Prediction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object]
Human Word Prediction ,[object Object],[object Object],[object Object],[object Object],[object Object]
Claim ,[object Object],[object Object]
Applications ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Overview ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Simple N-Grams ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
N-grams ,[object Object],[object Object],[object Object],[object Object]
Computing the Probability of a Word Sequence ,[object Object],[object Object],[object Object],[object Object],[object Object]
Bigram Model ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Using N-Grams ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
The  n -gram Approximation ,[object Object],[object Object],[object Object],[object Object],[object Object]
n- grams, continued ,[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object]
N-grams for Language Generation ,[object Object],Unigram: 5. …Here words are chosen independently but with their appropriate frequencies. REPRESENTING AND SPEEDILY IS AN GOOD APT OR COME CAN DIFFERENT NATURAL HERE HE THE A IN CAME THE TO OF TO EXPERT GRAY COME TO FURNISHES THE LINE MESSAGE HAD BE THESE. Bigram: 6. Second-order word approximation. The word transition probabilities are correct but no further structure is included. THE HEAD AND IN FRONTAL ATTACK ON AN ENGLISH WRITER THAT THE CHARACTER OF THIS POINT IS THEREFORE ANOTHER METHOD FOR THE LETTERS THAT THE TIME OF WHO EVER TOLD THE PROBLEM FOR AN UNEXPECTED.
N-Gram Models of Language ,[object Object],[object Object],[object Object],[object Object],[object Object]
Counting Words in Corpora ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Terminology ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Corpora ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Simple N-Grams ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Computing the Probability of a Word Sequence ,[object Object],[object Object],[object Object],[object Object],[object Object]
Bigram Model ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Using N-Grams ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Training and Testing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
A Simple Example ,[object Object],[object Object]
A Bigram Grammar Fragment from BERP .001 Eat British .03 Eat today .007 Eat dessert .04 Eat Indian .01 Eat tomorrow .04 Eat a .02 Eat Mexican .04 Eat at .02 Eat Chinese .05 Eat dinner .02 Eat in .06 Eat lunch .03 Eat breakfast .06 Eat some .03 Eat Thai .16 Eat on
.01 British lunch .05 Want a .01 British cuisine .65 Want to .15 British restaurant .04 I have .60 British food .08 I don’t .02 To be .29 I would .09 To spend .32 I want .14 To have .02 <start> I’m .26 To eat .04 <start> Tell .01 Want Thai .06 <start> I’d .04 Want some .25 <start> I
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
BERP Bigram Counts 0 1 0 0 0 0 4 Lunch 0 0 0 0 17 0 19 Food 1 120 0 0 0 0 2 Chinese 52 2 19 0 2 0 0 Eat 12 0 3 860 10 0 3 To 6 8 6 0 786 0 3 Want 0 0 0 13 0 1087 8 I lunch Food Chinese Eat To Want I
BERP Bigram Probabilities ,[object Object],[object Object],[object Object],[object Object],[object Object],459 1506 213 938 3256 1215 3437 Lunch Food Chinese Eat To Want I
What do we learn about the language? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object]
Approximating Shakespeare ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object]
N-Gram Training Sensitivity ,[object Object],[object Object]
Some Useful Empirical Observations ,[object Object],[object Object],[object Object],[object Object],[object Object]
Smoothing Techniques ,[object Object],[object Object],[object Object]
Smoothing Techniques ,[object Object],[object Object],[object Object]
Add-one Smoothing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],Witten-Bell Discounting
[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Good-Turing Discounting
Backoff methods (e.g. Katz ‘87) ,[object Object],[object Object],[object Object],[object Object],[object Object]
Summary ,[object Object],[object Object],[object Object],[object Object]

More Related Content

What's hot

SDLC and Software Process Models
SDLC and Software Process ModelsSDLC and Software Process Models
SDLC and Software Process ModelsNana Sarpong
 
Rational Unified Process
Rational Unified ProcessRational Unified Process
Rational Unified ProcessKumar
 
Software System Engineering - Chapter 1
Software System Engineering - Chapter 1Software System Engineering - Chapter 1
Software System Engineering - Chapter 1Fadhil Ismail
 
Git and Github slides.pdf
Git and Github slides.pdfGit and Github slides.pdf
Git and Github slides.pdfTilton2
 
software project management Waterfall model
software project management Waterfall modelsoftware project management Waterfall model
software project management Waterfall modelREHMAT ULLAH
 
Software estimation
Software estimationSoftware estimation
Software estimationMd Shakir
 
Servlet and servlet life cycle
Servlet and servlet life cycleServlet and servlet life cycle
Servlet and servlet life cycleDhruvin Nakrani
 
Evolutionary process models se.ppt
Evolutionary process models se.pptEvolutionary process models se.ppt
Evolutionary process models se.pptbhadjaashvini1
 
MG6088 SOFTWARE PROJECT MANAGEMENT
MG6088 SOFTWARE PROJECT MANAGEMENTMG6088 SOFTWARE PROJECT MANAGEMENT
MG6088 SOFTWARE PROJECT MANAGEMENTKathirvel Ayyaswamy
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Brian Brazil
 
6 shallow parsing introduction
6 shallow parsing introduction6 shallow parsing introduction
6 shallow parsing introductionThennarasuSakkan
 
Making Of PHP Based Web Application
Making Of PHP Based Web ApplicationMaking Of PHP Based Web Application
Making Of PHP Based Web ApplicationSachin Walvekar
 

What's hot (20)

Xml parsers
Xml parsersXml parsers
Xml parsers
 
Ngrams smoothing
Ngrams smoothingNgrams smoothing
Ngrams smoothing
 
SDLC and Software Process Models
SDLC and Software Process ModelsSDLC and Software Process Models
SDLC and Software Process Models
 
Rational Unified Process
Rational Unified ProcessRational Unified Process
Rational Unified Process
 
Software System Engineering - Chapter 1
Software System Engineering - Chapter 1Software System Engineering - Chapter 1
Software System Engineering - Chapter 1
 
Kafka Deep Dive
Kafka Deep DiveKafka Deep Dive
Kafka Deep Dive
 
Agile model
Agile modelAgile model
Agile model
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Git and Github slides.pdf
Git and Github slides.pdfGit and Github slides.pdf
Git and Github slides.pdf
 
software project management Waterfall model
software project management Waterfall modelsoftware project management Waterfall model
software project management Waterfall model
 
Grafana 7.0
Grafana 7.0Grafana 7.0
Grafana 7.0
 
Software estimation
Software estimationSoftware estimation
Software estimation
 
InfluxDB & Grafana
InfluxDB & GrafanaInfluxDB & Grafana
InfluxDB & Grafana
 
Servlet and servlet life cycle
Servlet and servlet life cycleServlet and servlet life cycle
Servlet and servlet life cycle
 
Evolutionary process models se.ppt
Evolutionary process models se.pptEvolutionary process models se.ppt
Evolutionary process models se.ppt
 
MG6088 SOFTWARE PROJECT MANAGEMENT
MG6088 SOFTWARE PROJECT MANAGEMENTMG6088 SOFTWARE PROJECT MANAGEMENT
MG6088 SOFTWARE PROJECT MANAGEMENT
 
Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)Prometheus and Docker (Docker Galway, November 2015)
Prometheus and Docker (Docker Galway, November 2015)
 
6 shallow parsing introduction
6 shallow parsing introduction6 shallow parsing introduction
6 shallow parsing introduction
 
Agile software development
Agile software developmentAgile software development
Agile software development
 
Making Of PHP Based Web Application
Making Of PHP Based Web ApplicationMaking Of PHP Based Web Application
Making Of PHP Based Web Application
 

Similar to NLP new words

Chapter 2: Text Operation in information stroage and retrieval
Chapter 2: Text Operation in information stroage and retrievalChapter 2: Text Operation in information stroage and retrieval
Chapter 2: Text Operation in information stroage and retrievalcaptainmactavish1996
 
Lecture 7- Text Statistics and Document Parsing
Lecture 7- Text Statistics and Document ParsingLecture 7- Text Statistics and Document Parsing
Lecture 7- Text Statistics and Document ParsingSean Golliher
 
crypto_graphy_PPTs.pdf
crypto_graphy_PPTs.pdfcrypto_graphy_PPTs.pdf
crypto_graphy_PPTs.pdfMajidMumtaz3
 
Chapter 2 Text Operation.pdf
Chapter 2 Text Operation.pdfChapter 2 Text Operation.pdf
Chapter 2 Text Operation.pdfHabtamu100
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguisticsshrey bhate
 
Coms30123 Synthesis 3 Projector
Coms30123 Synthesis 3 ProjectorComs30123 Synthesis 3 Projector
Coms30123 Synthesis 3 ProjectorDr. Cupid Lucid
 
Chapter 2 Text Operation and Term Weighting.pdf
Chapter 2 Text Operation and Term Weighting.pdfChapter 2 Text Operation and Term Weighting.pdf
Chapter 2 Text Operation and Term Weighting.pdfJemalNesre1
 
Stemming algorithms
Stemming algorithmsStemming algorithms
Stemming algorithmsRaghu nath
 
NLTK: Natural Language Processing made easy
NLTK: Natural Language Processing made easyNLTK: Natural Language Processing made easy
NLTK: Natural Language Processing made easyoutsider2
 
Natural Language Processing made easy
Natural Language Processing made easyNatural Language Processing made easy
Natural Language Processing made easyGopi Krishnan Nambiar
 
Natural Language parsing.pptx
Natural Language parsing.pptxNatural Language parsing.pptx
Natural Language parsing.pptxsiddhantroy13
 
ToC_M1L3_Grammar and Derivation.pdf
ToC_M1L3_Grammar and Derivation.pdfToC_M1L3_Grammar and Derivation.pdf
ToC_M1L3_Grammar and Derivation.pdfjaishreemane73
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Mustafa Jarrar
 
NLP_guest_lecture.pdf
NLP_guest_lecture.pdfNLP_guest_lecture.pdf
NLP_guest_lecture.pdfSoha82
 
Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Rajnish Raj
 

Similar to NLP new words (20)

Chapter 2: Text Operation in information stroage and retrieval
Chapter 2: Text Operation in information stroage and retrievalChapter 2: Text Operation in information stroage and retrieval
Chapter 2: Text Operation in information stroage and retrieval
 
Lecture 7- Text Statistics and Document Parsing
Lecture 7- Text Statistics and Document ParsingLecture 7- Text Statistics and Document Parsing
Lecture 7- Text Statistics and Document Parsing
 
crypto_graphy_PPTs.pdf
crypto_graphy_PPTs.pdfcrypto_graphy_PPTs.pdf
crypto_graphy_PPTs.pdf
 
Chapter 2 Text Operation.pdf
Chapter 2 Text Operation.pdfChapter 2 Text Operation.pdf
Chapter 2 Text Operation.pdf
 
Computational linguistics
Computational linguisticsComputational linguistics
Computational linguistics
 
Coms30123 Synthesis 3 Projector
Coms30123 Synthesis 3 ProjectorComs30123 Synthesis 3 Projector
Coms30123 Synthesis 3 Projector
 
Introduction to linguistics
Introduction to linguisticsIntroduction to linguistics
Introduction to linguistics
 
Chapter 2 Text Operation and Term Weighting.pdf
Chapter 2 Text Operation and Term Weighting.pdfChapter 2 Text Operation and Term Weighting.pdf
Chapter 2 Text Operation and Term Weighting.pdf
 
OpenNLP demo
OpenNLP demoOpenNLP demo
OpenNLP demo
 
Ir 03
Ir   03Ir   03
Ir 03
 
Stemming algorithms
Stemming algorithmsStemming algorithms
Stemming algorithms
 
NLTK: Natural Language Processing made easy
NLTK: Natural Language Processing made easyNLTK: Natural Language Processing made easy
NLTK: Natural Language Processing made easy
 
Natural Language Processing made easy
Natural Language Processing made easyNatural Language Processing made easy
Natural Language Processing made easy
 
Linguistics
LinguisticsLinguistics
Linguistics
 
Natural Language parsing.pptx
Natural Language parsing.pptxNatural Language parsing.pptx
Natural Language parsing.pptx
 
ToC_M1L3_Grammar and Derivation.pdf
ToC_M1L3_Grammar and Derivation.pdfToC_M1L3_Grammar and Derivation.pdf
ToC_M1L3_Grammar and Derivation.pdf
 
Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing Adnan: Introduction to Natural Language Processing
Adnan: Introduction to Natural Language Processing
 
NLP_guest_lecture.pdf
NLP_guest_lecture.pdfNLP_guest_lecture.pdf
NLP_guest_lecture.pdf
 
Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...Natural Language processing Parts of speech tagging, its classes, and how to ...
Natural Language processing Parts of speech tagging, its classes, and how to ...
 
Nlp
NlpNlp
Nlp
 

Recently uploaded

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Recently uploaded (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

NLP new words

  • 1. CS60057 Speech &Natural Language Processing Autumn 2007 Lecture 5 2 August 2007
  • 2. WORDS The Building Blocks of Language
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.  
  • 11.
  • 12.
  • 13.
  • 14.
  • 15. Common words in Tom Sawyer but words in NL have an uneven distribution…
  • 16. Text properties (formalized) Sample word frequency data
  • 17.
  • 18.
  • 19.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35. N-Grams and Corpus Linguistics
  • 36. A bad language model N-grams & Language Modeling
  • 37. A bad language model
  • 38. A bad language model Herman is reprinted with permission from LaughingStock Licensing Inc., Ottawa Canada. All rights reserved.
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54.
  • 55.
  • 56.
  • 57.
  • 58.
  • 59.
  • 60.
  • 61.
  • 62.
  • 63.
  • 64.
  • 65.
  • 66. A Bigram Grammar Fragment from BERP .001 Eat British .03 Eat today .007 Eat dessert .04 Eat Indian .01 Eat tomorrow .04 Eat a .02 Eat Mexican .04 Eat at .02 Eat Chinese .05 Eat dinner .02 Eat in .06 Eat lunch .03 Eat breakfast .06 Eat some .03 Eat Thai .16 Eat on
  • 67. .01 British lunch .05 Want a .01 British cuisine .65 Want to .15 British restaurant .04 I have .60 British food .08 I don’t .02 To be .29 I would .09 To spend .32 I want .14 To have .02 <start> I’m .26 To eat .04 <start> Tell .01 Want Thai .06 <start> I’d .04 Want some .25 <start> I
  • 68.
  • 69. BERP Bigram Counts 0 1 0 0 0 0 4 Lunch 0 0 0 0 17 0 19 Food 1 120 0 0 0 0 2 Chinese 52 2 19 0 2 0 0 Eat 12 0 3 860 10 0 3 To 6 8 6 0 786 0 3 Want 0 0 0 13 0 1087 8 I lunch Food Chinese Eat To Want I
  • 70.
  • 71.
  • 72.
  • 73.
  • 74.
  • 75.
  • 76.
  • 77.
  • 78.
  • 79.
  • 80.
  • 81.
  • 82.
  • 83.
  • 84.
  • 85.
  • 86.