SlideShare a Scribd company logo
1 of 62
Language Technology (A Machine Learning Approach) Antal van den Bosch & Walter Daelemans [email_address] [email_address] http://ilk.uvt.nl/~antalb/ltua
Programme ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Evaluation ,[object Object],[object Object]
Language Technology overview
Text Meaning LT Components Applications Lexical / Morphological Analysis Syntactic Analysis Semantic Analysis Discourse Analysis Tagging Chunking Word Sense Disambiguation Grammatical Relation Finding Named Entity Recognition Reference Resolution OCR Spelling Error Correction Grammar Checking Information retrieval Information Extraction Summarization Machine Translation Document Classification Ontology Extraction and Refinement Question Answering Dialogue Systems
Text Representation Units ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Text is a special kind of data ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Meaning Speech Text Speech recognition Language  comprehension Text generation Speech synthesis translation
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
How to reach Language Understanding ? ,[object Object],[object Object],[object Object]
What is meaning ? Eleni eats a pizza with banana Pizza = {p1, p2, p3, …} Eat ={<Nicolas,p1>,<Nicolas,p3>,<Eleni,p2>,…} Contain ={<p1,ansjovis>,<p1,tomaat>,<p2,banaan>,…} x=p2 Semantic networks, Frames First-order predicate calculus Set theory Eleni banana pizza eat contain  (x): pizza(x)   eat (Eleni,x)   contain  (x, banaan ) “ Symbol grounding” problem Representation and processing of time, causality, modality, defaults, common sense, … “ Meaning is in the mind of the beholder”
Language Technology Language Data Model Input Representation Output Representation  Acquisition  Processing
Deductive Route Language  Data Model Input  Representation Output Representation  S -> NP VP NP -> ART A N NP -> ART N VP -> V NP ...
Deductive Route ,[object Object],[object Object],[object Object],[object Object]
Inductive Route Language  Data Model Input  Representation Output  Representation   De computertaalkunde, in navolging van de taalkunde aanvankelijk sterk regel-gebaseerd, is onder druk van toepasbaarheid en grotere rekenkracht de laatste tien jaar geleidelijk geevolueerd naar een meer statistische, corpus-gebaseerde en inductieve benadering. De laatste jaren hebben ook technieken uit de theorie van zelflerende systemen (Machine Learning, zie Mitchell, 1998 voor een inleiding) aan belang gewonnen. Deze technieken zijn in zekere zin p(corpustaalkunde|de), p(in|corpustaalkunde), p(corpustaalkunde), p(de), p(in), ...
Inductive Route ,[object Object],[object Object],[object Object],[object Object]
Advantages ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Problems ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Text Mining ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Document Set of Documents Author Recognition Document Dating Language Identification Text Categorization Information Extraction Summarization Question Answering Topic Detection and Tracking Document Clustering Terminology Extraction Ontology Extraction Structured Information Knowledge Discovery + existing data
Information Extraction ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Example: MUC-terrorisme ,[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
IEX System Architecture ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Question Answering ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
QA System: Shapaqa ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Shapaqa: example ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Shapaqa: example (2) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
- Silence -
Empiricism, analogy, induction, language ,[object Object],[object Object],[object Object]
Lightweight history (2) ,[object Object],[object Object],[object Object],[object Object]
Lightweight history (3) ,[object Object],[object Object],[object Object],[object Object]
Lightweight history (4) ,[object Object],[object Object],[object Object],[object Object]
Lightweight history (5) ,[object Object],[object Object],[object Object],[object Object]
Analogical memory-based language processing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Analogy (1) sequence  a sequence  b sequence  b’ sequence  a’ is similar to is similar to maps to maps to
Analogy (2) sequence  a sequence  b ? sequence  a’ is similar to is similar to maps to
Analogy (3) sequence  n sequence  b ? are similar to are similar to map to sequence  a sequence  f sequence  a’ sequence  n’ sequence  f’
Memory-based parsing ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
CGN treebank
Make data (1) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Make data (2) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Make data (3) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Crash course:  Machine Learning ,[object Object],[object Object],[object Object],[object Object],[object Object]
Machine Learning: Roots ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Machine Learning: 2 strands ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Empirical ML: Key Terms 1 ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Empirical ML: Key Terms 2 ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Empirical ML: 2 Flavours ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Greedy learning
Greedy learning
Lazy Learning
Lazy Learning
Greedy vs Lazy Learning ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Greedy vs Lazy Learning ,[object Object],[object Object],[object Object],[object Object],[object Object]
Greedy vs Lazy Learning ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Greedy vs Lazy Learning + abstraction - abstraction + generalization - generalization Decision Tree Induction Hyperplane discriminators Regression Handcrafting Table Lookup Memory-Based Learning
Greedy vs Lazy: So? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
ML and Natural Language ,[object Object],[object Object],[object Object],[object Object],[object Object]
Entropy & IG: Formulas

More Related Content

What's hot

슬라이드 1
슬라이드 1슬라이드 1
슬라이드 1butest
 
A Constructive Mathematics approach for NL formal grammars
A Constructive Mathematics approach for NL formal grammarsA Constructive Mathematics approach for NL formal grammars
A Constructive Mathematics approach for NL formal grammarsFederico Gobbo
 
A Low Dimensionality Representation for Language Variety Identification (CICL...
A Low Dimensionality Representation for Language Variety Identification (CICL...A Low Dimensionality Representation for Language Variety Identification (CICL...
A Low Dimensionality Representation for Language Variety Identification (CICL...Francisco Manuel Rangel Pardo
 
Smart Content = Smart Business
Smart Content = Smart BusinessSmart Content = Smart Business
Smart Content = Smart BusinessSeth Grimes
 
Language Variety Identification using Distributed Representations of Words an...
Language Variety Identification using Distributed Representations of Words an...Language Variety Identification using Distributed Representations of Words an...
Language Variety Identification using Distributed Representations of Words an...Francisco Manuel Rangel Pardo
 
Text mining introduction-1
Text mining   introduction-1Text mining   introduction-1
Text mining introduction-1Sumit Sony
 
Tutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsTutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsAdrian Paschke
 
Practical cases, Applied linguistics course (MUI)
Practical cases, Applied linguistics course (MUI)Practical cases, Applied linguistics course (MUI)
Practical cases, Applied linguistics course (MUI)Alex Curtis
 
GATE, HLT and Machine Learning, Sheffield, July 2003
GATE, HLT and Machine Learning, Sheffield, July 2003GATE, HLT and Machine Learning, Sheffield, July 2003
GATE, HLT and Machine Learning, Sheffield, July 2003butest
 
Knowledge base system appl. p 1,2-ver1
Knowledge base system appl.  p 1,2-ver1Knowledge base system appl.  p 1,2-ver1
Knowledge base system appl. p 1,2-ver1Taymoor Nazmy
 
7 probability and statistics an introduction
7 probability and statistics an introduction7 probability and statistics an introduction
7 probability and statistics an introductionThennarasuSakkan
 
Semantic Web languages: Expressivity vs scalability
Semantic Web languages: Expressivity vs scalabilitySemantic Web languages: Expressivity vs scalability
Semantic Web languages: Expressivity vs scalabilitynvitucci
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language ProcessingTony Russell-Rose
 
Eswc2012 ss ontologies
Eswc2012 ss ontologiesEswc2012 ss ontologies
Eswc2012 ss ontologiesElena Simperl
 

What's hot (20)

슬라이드 1
슬라이드 1슬라이드 1
슬라이드 1
 
A Constructive Mathematics approach for NL formal grammars
A Constructive Mathematics approach for NL formal grammarsA Constructive Mathematics approach for NL formal grammars
A Constructive Mathematics approach for NL formal grammars
 
A Low Dimensionality Representation for Language Variety Identification (CICL...
A Low Dimensionality Representation for Language Variety Identification (CICL...A Low Dimensionality Representation for Language Variety Identification (CICL...
A Low Dimensionality Representation for Language Variety Identification (CICL...
 
Smart Content = Smart Business
Smart Content = Smart BusinessSmart Content = Smart Business
Smart Content = Smart Business
 
Language Variety Identification using Distributed Representations of Words an...
Language Variety Identification using Distributed Representations of Words an...Language Variety Identification using Distributed Representations of Words an...
Language Variety Identification using Distributed Representations of Words an...
 
Text mining introduction-1
Text mining   introduction-1Text mining   introduction-1
Text mining introduction-1
 
Tutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and SystemsTutorial - Introduction to Rule Technologies and Systems
Tutorial - Introduction to Rule Technologies and Systems
 
Language models
Language modelsLanguage models
Language models
 
Ontology learning
Ontology learningOntology learning
Ontology learning
 
Lecture20 xing
Lecture20 xingLecture20 xing
Lecture20 xing
 
Practical cases, Applied linguistics course (MUI)
Practical cases, Applied linguistics course (MUI)Practical cases, Applied linguistics course (MUI)
Practical cases, Applied linguistics course (MUI)
 
Structured Knowledge Representation
Structured Knowledge RepresentationStructured Knowledge Representation
Structured Knowledge Representation
 
Pargram2011
Pargram2011Pargram2011
Pargram2011
 
Information Extraction
Information ExtractionInformation Extraction
Information Extraction
 
GATE, HLT and Machine Learning, Sheffield, July 2003
GATE, HLT and Machine Learning, Sheffield, July 2003GATE, HLT and Machine Learning, Sheffield, July 2003
GATE, HLT and Machine Learning, Sheffield, July 2003
 
Knowledge base system appl. p 1,2-ver1
Knowledge base system appl.  p 1,2-ver1Knowledge base system appl.  p 1,2-ver1
Knowledge base system appl. p 1,2-ver1
 
7 probability and statistics an introduction
7 probability and statistics an introduction7 probability and statistics an introduction
7 probability and statistics an introduction
 
Semantic Web languages: Expressivity vs scalability
Semantic Web languages: Expressivity vs scalabilitySemantic Web languages: Expressivity vs scalability
Semantic Web languages: Expressivity vs scalability
 
Introduction to Natural Language Processing
Introduction to Natural Language ProcessingIntroduction to Natural Language Processing
Introduction to Natural Language Processing
 
Eswc2012 ss ontologies
Eswc2012 ss ontologiesEswc2012 ss ontologies
Eswc2012 ss ontologies
 

Similar to PPT slides

Copy of 10text (2)
Copy of 10text (2)Copy of 10text (2)
Copy of 10text (2)Uma Se
 
Chapter 10 Data Mining Techniques
 Chapter 10 Data Mining Techniques Chapter 10 Data Mining Techniques
Chapter 10 Data Mining TechniquesHouw Liong The
 
Semantic Search Component
Semantic Search ComponentSemantic Search Component
Semantic Search ComponentMario Flecha
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Bhaskar Mitra
 
The Geometry of Learning
The Geometry of LearningThe Geometry of Learning
The Geometry of Learningfridolin.wild
 
Topic models, vector semantics and applications
Topic models, vector semantics and applicationsTopic models, vector semantics and applications
Topic models, vector semantics and applicationsVasileios Lampos
 
Semantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaSemantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaGiorgia Lodi
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for SearchBhaskar Mitra
 
download
downloaddownload
downloadbutest
 
download
downloaddownload
downloadbutest
 
Meaningful Interaction Analysis
Meaningful Interaction AnalysisMeaningful Interaction Analysis
Meaningful Interaction Analysisfridolin.wild
 
Towards a theory of semantic communication
Towards a theory of semantic communicationTowards a theory of semantic communication
Towards a theory of semantic communicationJie Bao
 
Greimas.exe: digital code in the generative trajectory of expression
Greimas.exe: digital code in the generative trajectory of expressionGreimas.exe: digital code in the generative trajectory of expression
Greimas.exe: digital code in the generative trajectory of expressionEverardo Reyes-García
 
Lri Owl And Ontologies 04 04
Lri Owl And Ontologies 04 04Lri Owl And Ontologies 04 04
Lri Owl And Ontologies 04 04Rinke Hoekstra
 
Corpora, Blogs and Linguistic Variation (Paderborn)
Corpora, Blogs and Linguistic Variation (Paderborn)Corpora, Blogs and Linguistic Variation (Paderborn)
Corpora, Blogs and Linguistic Variation (Paderborn)Cornelius Puschmann
 

Similar to PPT slides (20)

Topical_Facets
Topical_FacetsTopical_Facets
Topical_Facets
 
Copy of 10text (2)
Copy of 10text (2)Copy of 10text (2)
Copy of 10text (2)
 
Web and text
Web and textWeb and text
Web and text
 
Chapter 10 Data Mining Techniques
 Chapter 10 Data Mining Techniques Chapter 10 Data Mining Techniques
Chapter 10 Data Mining Techniques
 
Semantic Search Component
Semantic Search ComponentSemantic Search Component
Semantic Search Component
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
 
The Geometry of Learning
The Geometry of LearningThe Geometry of Learning
The Geometry of Learning
 
ICAME 2010
ICAME 2010ICAME 2010
ICAME 2010
 
Topic models, vector semantics and applications
Topic models, vector semantics and applicationsTopic models, vector semantics and applications
Topic models, vector semantics and applications
 
Semantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenzaSemantic Interoperability - grafi della conoscenza
Semantic Interoperability - grafi della conoscenza
 
Deep Learning for Search
Deep Learning for SearchDeep Learning for Search
Deep Learning for Search
 
download
downloaddownload
download
 
download
downloaddownload
download
 
Meaningful Interaction Analysis
Meaningful Interaction AnalysisMeaningful Interaction Analysis
Meaningful Interaction Analysis
 
Towards a theory of semantic communication
Towards a theory of semantic communicationTowards a theory of semantic communication
Towards a theory of semantic communication
 
Greimas.exe: digital code in the generative trajectory of expression
Greimas.exe: digital code in the generative trajectory of expressionGreimas.exe: digital code in the generative trajectory of expression
Greimas.exe: digital code in the generative trajectory of expression
 
Eurolan 2005 Pedersen
Eurolan 2005 PedersenEurolan 2005 Pedersen
Eurolan 2005 Pedersen
 
Lri Owl And Ontologies 04 04
Lri Owl And Ontologies 04 04Lri Owl And Ontologies 04 04
Lri Owl And Ontologies 04 04
 
Corpora, Blogs and Linguistic Variation (Paderborn)
Corpora, Blogs and Linguistic Variation (Paderborn)Corpora, Blogs and Linguistic Variation (Paderborn)
Corpora, Blogs and Linguistic Variation (Paderborn)
 
Nlp
NlpNlp
Nlp
 

More from butest

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 

More from butest (20)

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 

PPT slides

  • 1. Language Technology (A Machine Learning Approach) Antal van den Bosch & Walter Daelemans [email_address] [email_address] http://ilk.uvt.nl/~antalb/ltua
  • 2.
  • 3.
  • 5. Text Meaning LT Components Applications Lexical / Morphological Analysis Syntactic Analysis Semantic Analysis Discourse Analysis Tagging Chunking Word Sense Disambiguation Grammatical Relation Finding Named Entity Recognition Reference Resolution OCR Spelling Error Correction Grammar Checking Information retrieval Information Extraction Summarization Machine Translation Document Classification Ontology Extraction and Refinement Question Answering Dialogue Systems
  • 6.
  • 7.
  • 8. Meaning Speech Text Speech recognition Language comprehension Text generation Speech synthesis translation
  • 9.
  • 10.
  • 11.
  • 12.
  • 13. What is meaning ? Eleni eats a pizza with banana Pizza = {p1, p2, p3, …} Eat ={<Nicolas,p1>,<Nicolas,p3>,<Eleni,p2>,…} Contain ={<p1,ansjovis>,<p1,tomaat>,<p2,banaan>,…} x=p2 Semantic networks, Frames First-order predicate calculus Set theory Eleni banana pizza eat contain  (x): pizza(x)  eat (Eleni,x)  contain (x, banaan ) “ Symbol grounding” problem Representation and processing of time, causality, modality, defaults, common sense, … “ Meaning is in the mind of the beholder”
  • 14. Language Technology Language Data Model Input Representation Output Representation Acquisition Processing
  • 15. Deductive Route Language Data Model Input Representation Output Representation S -> NP VP NP -> ART A N NP -> ART N VP -> V NP ...
  • 16.
  • 17. Inductive Route Language Data Model Input Representation Output Representation De computertaalkunde, in navolging van de taalkunde aanvankelijk sterk regel-gebaseerd, is onder druk van toepasbaarheid en grotere rekenkracht de laatste tien jaar geleidelijk geevolueerd naar een meer statistische, corpus-gebaseerde en inductieve benadering. De laatste jaren hebben ook technieken uit de theorie van zelflerende systemen (Machine Learning, zie Mitchell, 1998 voor een inleiding) aan belang gewonnen. Deze technieken zijn in zekere zin p(corpustaalkunde|de), p(in|corpustaalkunde), p(corpustaalkunde), p(de), p(in), ...
  • 18.
  • 19.
  • 20.
  • 21.
  • 22. Document Set of Documents Author Recognition Document Dating Language Identification Text Categorization Information Extraction Summarization Question Answering Topic Detection and Tracking Document Clustering Terminology Extraction Ontology Extraction Structured Information Knowledge Discovery + existing data
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38. Analogy (1) sequence a sequence b sequence b’ sequence a’ is similar to is similar to maps to maps to
  • 39. Analogy (2) sequence a sequence b ? sequence a’ is similar to is similar to maps to
  • 40. Analogy (3) sequence n sequence b ? are similar to are similar to map to sequence a sequence f sequence a’ sequence n’ sequence f’
  • 41.
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 56.
  • 57.
  • 58.
  • 59. Greedy vs Lazy Learning + abstraction - abstraction + generalization - generalization Decision Tree Induction Hyperplane discriminators Regression Handcrafting Table Lookup Memory-Based Learning
  • 60.
  • 61.
  • 62. Entropy & IG: Formulas