Grammar Development Platform
Miriam Butt
October 2002
Grammar Development
What is a Grammar Development Platform good for?
• Information Retrieval/Extraction
• Machine Translation (MT)
MT pipeline (XLE):
English: Anna sees the man.
→ Parser → English c-str and f-str
→ MT → German f-str
→ Generator → German: Anna sieht den Mann.
A Sample Development Platform
• Platforms: Unix (Solaris), Linux, Mac OS X
• Software (Shareware): Emacs, Tcl/Tk
XLE (Xerox Linguistic Environment)
• Main Developer: John Maxwell (PARC)
A Sample Development Platform
• Performance: Worst-case exponential,
polynomial in practice (makes broad-coverage
grammars feasible)
• Parser: Bottom-Up, Left-to-Right
XLE (Xerox Linguistic Environment)
• Linguistic Theory: LFG (Lexical-Functional
Grammar) originally developed by Ronald M.
Kaplan (PARC) and Joan Bresnan (Stanford)
The ParGram Project
• Palo Alto Research Center (PARC): English Grammar
• IMS, University of Stuttgart: German Grammar
• Fuji Xerox: Japanese Grammar
• University of Bergen: Norwegian (Bokmål and Nynorsk)
• UMIST: Urdu Grammar
• XRCE Grenoble: French Grammar
ParGram
Possible Applications:
• Machine Translation (French, English)
• Tree Banking (English, German)
• Smart Text Annotation (German)
• Robust Parsing (English, German, French)
• Information Extraction (English)
• Teaching Tools (Urdu)
Grammar Components
Each Grammar Contains:
• Phrase Structure Rules (S → NP VP)
• Lexicon (verb stems and functional elements)
• Finite-State Morphological Analyzer
No Semantics
Phrase Structure Rules
Formulation as used today goes back to Chomsky 1957.
Sample Set for English:
S NP VP
VP V NP
NP D (ADJ) N
Why these kinds of rules?
• Natural Language is recursive and potentially infinite.
• Constituency, X-bar Theory
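As a rough illustration of how the sample rules license sentences, here is a minimal top-down recognizer in Python. It is only a sketch: the lexicon and the names `RULES`, `LEXICON`, `parse` and `recognize` are assumptions made for this example, and it is not how XLE parses.

```python
# Toy grammar mirroring the three sample rules.
# The optional (ADJ) in the NP rule is expanded into two alternatives.
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["D", "N"], ["D", "ADJ", "N"]],
}

# Hypothetical lexicon, added only so the rules have words to apply to.
LEXICON = {
    "the": "D", "a": "D",
    "green": "ADJ",
    "girl": "N", "monkey": "N",
    "sees": "V",
}


def parse(cat, words, i):
    """Yield every end position j such that words[i:j] is a phrase of category cat."""
    if cat not in RULES:
        # Preterminal category: it must match exactly one word.
        if i < len(words) and LEXICON.get(words[i]) == cat:
            yield i + 1
        return
    for rhs in RULES[cat]:
        ends = [i]
        for sym in rhs:
            # Extend every partial match by one more daughter category.
            ends = [k for j in ends for k in parse(sym, words, j)]
        yield from ends


def recognize(sentence):
    """Return True if the whole sentence can be analysed as an S."""
    words = sentence.lower().rstrip(".").split()
    return len(words) in parse("S", words, 0)


print(recognize("The girl sees the green monkey."))
print(recognize("The girl sees."))
```

The first call succeeds and the second fails, because the sample VP rule requires an object NP.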
Phrase Structure Rules
The syntax of natural languages is context-free.
Colorless green ideas sleep furiously.
However, we must also deal with context-sensitive
information.
The monkey sleeps.
*The monkey sleep. *The monkeys sleeps.
Features and Unifications
Context-Sensitivity can be achieved in many ways.
XLE and LFG (like many other theories/platforms) use
phrase-structure annotation via attribute-value pairs.
S → NP VP
(↑ SUBJ) = ↓   (↑ SUBJ NUM) = (↓ NUM)
Features are checked via unification.
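The agreement check can be sketched with a small unification function over nested dictionaries. This is a simplified illustration, not XLE's implementation: the lexical entries in `NOUNS` and `VERBS` are hypothetical, and the sketch ignores structure sharing (reentrancy).

```python
def unify(f, g):
    """Unify two feature structures (nested dicts); return None on a clash."""
    result = dict(f)
    for attr, gval in g.items():
        if attr not in result:
            result[attr] = gval
        elif isinstance(result[attr], dict) and isinstance(gval, dict):
            sub = unify(result[attr], gval)
            if sub is None:
                return None
            result[attr] = sub
        elif result[attr] != gval:
            return None  # two atomic values disagree
    return result


# Hypothetical lexical entries: nouns carry NUM, verbs constrain their SUBJ's NUM.
NOUNS = {"monkey": {"NUM": "sg"}, "monkeys": {"NUM": "pl"}}
VERBS = {"sleeps": {"SUBJ": {"NUM": "sg"}}, "sleep": {"SUBJ": {"NUM": "pl"}}}


def sentence_fstructure(noun, verb):
    """Build the sentence's feature structure, or None if agreement fails."""
    # The NP annotation makes the noun's features the value of SUBJ.
    from_np = {"SUBJ": NOUNS[noun]}
    # The verb's constraints on SUBJ must unify with what the NP contributed.
    return unify(from_np, VERBS[verb])


print(sentence_fstructure("monkey", "sleeps"))
print(sentence_fstructure("monkey", "sleep"))
```

The first call returns a structure with a singular subject; the second returns `None` because `sg` and `pl` cannot be unified.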
The Ambiguity Problem
PP-Attachment
The girl saw the monkey with the telescope.
Categorial Ambiguity
Flying planes can be dangerous.
Time flies like an arrow.
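To make the PP-attachment case concrete, the sketch below counts parse trees with a small CYK-style chart. The PP rules and the lexicon are assumptions added for this example (the sample rule set in the slides has no PP rules), and the function name `count_parses` is hypothetical.

```python
from collections import defaultdict

# Hypothetical binary grammar: the sample rules plus PP rules.
# A PP may attach to the VP or to the NP.
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "D", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]

LEXICON = {
    "the": "D", "girl": "N", "monkey": "N", "telescope": "N",
    "saw": "V", "with": "P",
}


def count_parses(sentence, root="S"):
    """Count the distinct parse trees for a sentence under the toy grammar."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    chart = defaultdict(int)  # (start, end, category) -> number of trees
    for i, word in enumerate(words):
        chart[i, i + 1, LEXICON[word]] += 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    # Every pairing of a left tree with a right tree is a new tree.
                    chart[start, end, parent] += (
                        chart[start, mid, left] * chart[mid, end, right]
                    )
    return chart[0, n, root]


print(count_parses("The girl saw the monkey with the telescope."))
```

Under these assumed rules the sentence gets two trees: one where the PP modifies the seeing event and one where it modifies the monkey.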
Lexicons
Typically Contain:
• Category Information (Terminal Node in Tree)
• Context Sensitive Featural Information
• Subcategorization Information
• Semantics (sometimes)
Ambiguity in Large Grammars
Ambiguity: a serious problem even in simple sentences
• PP-attachment (English)
• Subject/Object Ambiguities (German)
Within XLE various techniques have been invented to cut down
on the explosion of parses.
• Optimality Marking
• Packed Representations
Morphologies and Tokenizers
Beyond the Word: Writing and adding in
Morphological Analysis and Tokenization
Parallel Analyses
English: Yassin was seen.
German: Yassin wurde gesehen.
Urdu: yassin dekha gaya
Languages Differ on the Surface (c-structure)
ParGram Goal: The same underlying f-structures
for all languages (modulo lexical semantics).
The “Parallel” in ParGram
Analyses at the level of f-structure are held as parallel as
possible across languages (crosslinguistic invariance).
• Theoretical Advantage: This models the idea of UG.
• Applicational Advantage: machine translation is made
easier.
Analyses at the level of c-structure are allowed to differ
much more (variance across languages).
FST Morphological Analyzers
Kaplan and Butt (2002): this LFG morphology-syntax interface is
natural:
Surface form: calana ‘to drive’ (M.Sg)
Sequence Relation (Seq): drive+Verb+Inf+M+Sg
Lexical Relation (L): [VFORM inf] [NUM sg] [GEND masc]
Satisfaction Relation (Sat): f-structure (m-structure)
PRED ‘drive<Subj,Obj>’
VFORM inf
GEND masc
NUM sg
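The path from surface form to f-structure can be mimicked with plain lookup tables. This is only a sketch: a real analyzer would be a finite-state transducer, and the tables `ANALYSES`, `TAG_FEATURES` and `SUBCAT` are hypothetical stand-ins populated with the single Urdu example from the slide.

```python
# Surface form -> tag sequence (stands in for the finite-state analyzer).
ANALYSES = {"calana": "drive+Verb+Inf+M+Sg"}

# Morphological tag -> attribute-value pair it contributes.
TAG_FEATURES = {
    "Inf": ("VFORM", "inf"),
    "M": ("GEND", "masc"),
    "Sg": ("NUM", "sg"),
}

# Lemma -> subcategorization frame (hypothetical lexicon lookup).
SUBCAT = {"drive": "<Subj,Obj>"}


def analyze(surface):
    """Return (category, f-structure) for a known surface form."""
    lemma, category, *tags = ANALYSES[surface].split("+")
    fstructure = {"PRED": f"{lemma}{SUBCAT[lemma]}"}
    for tag in tags:
        attr, value = TAG_FEATURES[tag]
        fstructure[attr] = value
    return category, fstructure


print(analyze("calana"))
```

The tag sequence is split into lemma, category and feature tags, and each feature tag contributes one attribute-value pair to the f-structure.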