Grammar Development
Platform
Miriam Butt
October 2002
Grammar Development
What is a Grammar Development Platform good for?
• Information Retrieval/Extraction
• Machine Translation (MT)
[Diagram: MT with XLE]
English: Anna sees the man.
→ Parser → English c-str and f-str
→ MT → German f-str
→ Generator → German: Anna sieht den Mann.
A Sample Development Platform
• Platforms: Unix (Solaris), Linux, Mac OS X
• Software (Shareware): Emacs, Tcl/Tk
XLE (Xerox Linguistic Environment)
• Main Developer: John Maxwell (PARC)
A Sample Development Platform
• Performance: Worst-case exponential, polynomial in practice (makes broad-coverage grammars feasible)
• Parser: Bottom-Up, Left-to-Right
XLE (Xerox Linguistic Environment)
• Linguistic Theory: LFG (Lexical-Functional Grammar), originally developed by Ronald M. Kaplan (PARC) and Joan Bresnan (Stanford)
The ParGram Project
• Palo Alto Research Center (PARC): English Grammar
• IMS, University of Stuttgart: German Grammar
• Fuji Xerox: Japanese Grammar
• University of Bergen: Norwegian Grammar (Bokmål and Nynorsk)
• UMIST: Urdu Grammar
• XRCE Grenoble: French Grammar
ParGram
Possible Applications:
• Machine Translation (French, English)
• Tree Banking (English, German)
• Smart Text Annotation (German)
• Robust Parsing (English, German, French)
• Information Extraction (English)
• Teaching Tools (Urdu)
Grammar Components
Each Grammar Contains:
• Phrase Structure Rules (S → NP VP)
• Lexicon (verb stems and functional elements)
• Finite-State Morphological Analyzer
No Semantics
Phrase Structure Rules
Formulation as used today goes back to Chomsky 1957.
Sample Set for English:
S NP VP
VP V NP
NP D (ADJ) N
Why these kinds of rules?
• Natural Language is recursive and potentially infinite.
• Constituency, X-bar Theory
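The sample rule set can be tried out with a few lines of Python. This is only a sketch: the toy lexicon and the function names `parse` and `parse_sentence` are assumptions made for illustration, the optional (ADJ) is expanded into two NP rules, and the simple top-down search used here is not how XLE parses.

```python
from typing import Dict, Iterator, List, Tuple

# The three sample rules; NP -> D (ADJ) N is written out as two alternatives.
RULES: Dict[str, List[List[str]]] = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["D", "ADJ", "N"], ["D", "N"]],
}

# Hypothetical toy lexicon (not taken from the slides).
LEXICON: Dict[str, str] = {
    "the": "D", "a": "D",
    "small": "ADJ",
    "girl": "N", "monkey": "N",
    "sees": "V",
}


def parse(symbol: str, words: List[str], start: int) -> Iterator[Tuple[tuple, int]]:
    """Yield (tree, end) for every way `symbol` can cover words[start:end]."""
    if symbol not in RULES:
        # Preterminal: the next word must carry this category in the lexicon.
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rhs in RULES[symbol]:
        # Each partial result is (children found so far, next word position).
        partial = [([], start)]
        for child in rhs:
            partial = [
                (kids + [tree], end)
                for kids, pos in partial
                for tree, end in parse(child, words, pos)
            ]
        for kids, end in partial:
            yield (symbol, *kids), end


def parse_sentence(sentence: str) -> List[tuple]:
    """Return all S trees that cover the whole sentence."""
    words = sentence.lower().rstrip(".").split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


if __name__ == "__main__":
    for tree in parse_sentence("The girl sees a small monkey."):
        print(tree)
```

The rules say nothing about agreement, so this grammar would accept "the girl see a monkey" just as readily once "see" is added to the lexicon as a V.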
Phrase Structure Rules
The syntax of natural languages is context-free.
Colorless green ideas sleep furiously.
However, we must also deal with context-sensitive information.
The monkey sleeps.
*The monkey sleep. *The monkeys sleeps.
Features and Unifications
Context-Sensitivity can be achieved in many ways.
XLE and LFG (like many other theories/platforms) use
phrase-structure annotation via attribute-value pairs.
S → NP VP
(↑ SUBJ) = ↓    (↑ SUBJ NUM) = (↓ NUM)
Features are checked via Unification.
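Unification of attribute-value structures can be sketched with nested Python dictionaries. The function `unify` and the feature values assigned to "monkey", "sleeps" and "sleep" below are simplifying assumptions for illustration; they are not XLE's implementation or the lexical entries of any ParGram grammar.

```python
from typing import Any, Dict, Optional

FStruct = Dict[str, Any]


def unify(a: FStruct, b: FStruct) -> Optional[FStruct]:
    """Return the unification of two feature structures, or None on a clash."""
    result = dict(a)  # shallow copy, so the inputs are never modified
    for attr, b_val in b.items():
        if attr not in result:
            result[attr] = b_val
            continue
        a_val = result[attr]
        if isinstance(a_val, dict) and isinstance(b_val, dict):
            # Both values are embedded structures: unify them recursively.
            sub = unify(a_val, b_val)
            if sub is None:
                return None
            result[attr] = sub
        elif a_val != b_val:
            # Two different atomic values for the same attribute: failure.
            return None
    return result


# Hypothetical contributions of the subject NP and of two verb forms.
the_monkey = {"SUBJ": {"PRED": "monkey", "NUM": "sg"}}
sleeps = {"PRED": "sleep<SUBJ>", "SUBJ": {"NUM": "sg"}}
sleep = {"PRED": "sleep<SUBJ>", "SUBJ": {"NUM": "pl"}}

if __name__ == "__main__":
    print(unify(the_monkey, sleeps))  # succeeds: NUM sg on both sides
    print(unify(the_monkey, sleep))   # None: NUM sg clashes with NUM pl
```

Under these assumptions the agreement error in "The monkey sleep" shows up as a failed unification on SUBJ NUM, while the phrase-structure rule itself stays context-free.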
The Ambiguity Problem
PP-Attachment
The girl saw the monkey with the telescope.
Categorial Ambiguity
Flying planes can be dangerous.
Time flies like an arrow.
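The PP-attachment example can be reproduced with a small chart parser that lists every bracketing. The binary rules and the lexicon below are assumptions chosen for illustration (a toy grammar with NP → NP PP and VP → VP PP), and the function name `all_parses` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical toy grammar: (parent, left child, right child).
BINARY: List[Tuple[str, str, str]] = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # the PP modifies the verb phrase
    ("NP", "D", "N"),
    ("NP", "NP", "PP"),  # the PP modifies the noun phrase
    ("PP", "P", "NP"),
]

LEXICON: Dict[str, str] = {
    "the": "D", "girl": "N", "monkey": "N", "telescope": "N",
    "saw": "V", "with": "P",
}


def all_parses(sentence: str, root: str = "S") -> List[str]:
    """Return every analysis of the sentence as a bracketed string."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    # chart[(start, end, category)] lists all bracketings of that span.
    chart: Dict[Tuple[int, int, str], List[str]] = defaultdict(list)
    for i, word in enumerate(words):
        cat = LEXICON.get(word)
        if cat is None:
            return []  # unknown word: no analysis
        chart[(i, i + 1, cat)].append(f"[{cat} {word}]")
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    for l_tree in chart[(start, mid, left)]:
                        for r_tree in chart[(mid, end, right)]:
                            chart[(start, end, parent)].append(
                                f"[{parent} {l_tree} {r_tree}]"
                            )
    return chart[(0, n, root)]


if __name__ == "__main__":
    for tree in all_parses("The girl saw the monkey with the telescope."):
        print(tree)
```

With this toy grammar the sentence receives two analyses: one in which "with the telescope" attaches to the VP (the seeing is done with the telescope) and one in which it attaches to the NP (the monkey has the telescope).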
Lexicons
Typically Contain:
• Category Information (Terminal Node in Tree)
• Context-Sensitive Featural Information
• Subcategorization Information
• Semantics (sometimes)
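One way to picture such an entry is as a small record with one field per kind of information. The class `LexEntry`, its field names and the entry for "sees" are hypothetical and only illustrate the list above; XLE lexicon files use their own notation.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class LexEntry:
    form: str                      # the word form
    category: str                  # terminal node in the tree, e.g. V
    features: Dict[str, str] = field(default_factory=dict)  # featural information
    subcat: Tuple[str, ...] = ()   # grammatical functions the word requires


def subcat_satisfied(entry: LexEntry, functions_present: Iterable[str]) -> bool:
    """True if exactly the subcategorized functions are present."""
    return set(entry.subcat) == set(functions_present)


# Hypothetical entry for a transitive verb form.
SEES = LexEntry(
    form="sees",
    category="V",
    features={"TENSE": "pres", "SUBJ NUM": "sg"},
    subcat=("SUBJ", "OBJ"),
)

if __name__ == "__main__":
    print(subcat_satisfied(SEES, ["SUBJ", "OBJ"]))  # True
    print(subcat_satisfied(SEES, ["SUBJ"]))         # False: the object is missing
```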
Ambiguity in Large Grammars
Ambiguity: a serious problem even in simple sentences
• PP-attachment (English)
• Subject/Object Ambiguities (German)
Within XLE, various techniques have been invented to cut down on the explosion of parses.
• Optimality Marking
• Packed Representations
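The idea behind packing can be illustrated with a chart that stores each sub-analysis once and only counts how it combines, instead of writing out every tree. This is a sketch of the general idea on c-structure only, not XLE's actual packed representation; the grammar, the lexicon and the function name `count_parses` are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical toy grammar: (parent, left child, right child).
BINARY: List[Tuple[str, str, str]] = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "D", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]

LEXICON: Dict[str, str] = {
    "the": "D", "girl": "N", "monkey": "N", "telescope": "N",
    "park": "N", "hill": "N", "saw": "V",
    "with": "P", "in": "P", "on": "P",
}


def count_parses(sentence: str, root: str = "S") -> int:
    """Count analyses without enumerating them."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    # chart[(start, end, category)] holds the number of analyses of that span.
    chart: Dict[Tuple[int, int, str], int] = defaultdict(int)
    for i, word in enumerate(words):
        if word not in LEXICON:
            return 0  # unknown word: no analysis
        chart[(i, i + 1, LEXICON[word])] = 1
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    # Shared sub-spans are counted once and reused here.
                    chart[(start, end, parent)] += (
                        chart[(start, mid, left)] * chart[(mid, end, right)]
                    )
    return chart[(0, n, root)]


if __name__ == "__main__":
    sentence = "the girl saw the monkey"
    print(count_parses(sentence))  # 1
    for pp in ["with the telescope", "in the park", "on the hill"]:
        sentence += " " + pp
        print(count_parses(sentence))  # 2, then 5, then 14
```

Under this toy grammar each added PP multiplies the readings (1, 2, 5, 14), while the chart itself grows only polynomially with sentence length.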
Morphologies and Tokenizers
Beyond the Word: Writing and adding in Morphological Analysis and Tokenization
Parallel Analyses
English: Yassin was seen.
German: Yassin wurde gesehen.
Urdu: yassin dekha gaya
Languages Differ on the Surface (c-structure)
ParGram Goal: The same underlying f-structures for all languages (modulo lexical semantics).
The “Parallel” in ParGram
Analyses at the level of f-structure are held as parallel as possible across languages (crosslinguistic invariance).
• Theoretical Advantage: This models the idea of Universal Grammar (UG).
• Applicational Advantage: machine translation is made easier.
Analyses at the level of c-structure are allowed to differ much more (variance across languages).
FST Morphological Analyzers
Kaplan and Butt (2002): this LFG morphology-syntax interface is
natural:
[Diagram: from surface form to f-structure]
Surface form: calana ‘to drive’ (M.Sg)
→ Sequence Relation (Seq) → drive+Verb+Inf+M+Sg
→ Lexical Relation (L) → [VFORM inf], [NUM sg], [GEND masc]
→ Satisfaction Relation (Sat) → f-structure (m-structure):
PRED ‘drive<Subj,Obj>’
VFORM inf
GEND masc
NUM sg
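The three steps of this interface can be mimicked in a few lines of Python. In this sketch a plain dictionary lookup stands in for the finite-state analyzer, and the tables `ANALYSES`, `TAG_FEATURES` and `STEM_PRED` as well as the function names are hypothetical; only the example form and its tags come from the slide.

```python
from typing import Dict, List

# Stand-in for the finite-state morphological analyzer (Sequence Relation):
# surface form -> list of stem+tag analyses.
ANALYSES: Dict[str, List[str]] = {
    "calana": ["drive+Verb+Inf+M+Sg"],
}

# Lexical Relation: each tag contributes a (possibly empty) set of features.
TAG_FEATURES: Dict[str, Dict[str, str]] = {
    "+Verb": {},
    "+Inf": {"VFORM": "inf"},
    "+M": {"GEND": "masc"},
    "+Sg": {"NUM": "sg"},
}

# The stem contributes the PRED value.
STEM_PRED: Dict[str, str] = {"drive": "drive<Subj,Obj>"}


def analyze(surface: str) -> List[str]:
    """Look up the morphological analyses of a surface form."""
    return ANALYSES.get(surface, [])


def to_fstructure(analysis: str) -> Dict[str, str]:
    """Collect the features of stem and tags into one flat f-structure
    (Satisfaction Relation); conflicting values raise an error."""
    stem, *tags = analysis.split("+")
    fstructure = {"PRED": STEM_PRED[stem]}
    for tag in tags:
        for attr, value in TAG_FEATURES["+" + tag].items():
            if fstructure.get(attr, value) != value:
                raise ValueError(f"feature clash on {attr}")
            fstructure[attr] = value
    return fstructure


if __name__ == "__main__":
    for analysis in analyze("calana"):
        print(analysis, to_fstructure(analysis))
```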