Formal Languages
FSAs in NLP
Hinrich Schütze
IMS, Uni Stuttgart, WS 2006/07
Most slides borrowed from K. Haenelt and E. Gurari
Formal Language Theory
• Two different goals in computational linguistics
• Theoretical interest
– What is the correct formalization of natural language?
– What does this formalization tell us about the properties of natural
language?
– What are the limits of NLP algorithms – in principle?
• E.g., if natural language were context-free, this would imply that syntactic
analysis of natural language is possible in cubic time.
• Practical interest
– Well-understood mathematical and computational framework for
solving NLP problems
– ... even if the formalization is not "cognitively" sound
• Today: finite state for practical applications
Language-Technology
Based on Finite-State-Devices
shallow parsing
head-modifier-
pairs
tokenising
speech
recognition
translation
spelling
correction
dictionaries
rules
analysis
synthesis
transfer
phonology
morphology
fact extraction
text:speech
speech:text
part-of-speech
tagging
Advantages of
Finite-State Devices
• efficiency
– time
• very fast, if deterministic or with a low degree of non-determinism
– space:
• compressed representations of data
• that double as a search structure (comparable to a hash function)
• system development and maintenance
– modular design and
automatic compilation of system components
– high level specifications
• language modelling
– uniform framework for modelling dictionaries and rules
Modelling Goal
• specification of a formal language that corresponds as closely
as possible to a natural language
• ideally the formal system should
– never undergenerate
(i.e. accept or generate all the strings that characterise a natural
language)
– never overgenerate
(i.e. not accept or generate any string which is not acceptable in a
real language)
• realistically
– natural languages are moving targets (productivity, variations)
– approximations are achievable
– Finite-state is a crude, but useful approximation
(Beesley/Karttunen, 2003)
Layers of Linguistic Modelling
[diagram: four layers of analysis for one sentence
 Text: The Commission promotes information on products in third countries
 Lex:  the Commission promote information on product in third country
       (dete verb noun prpo noun prpo noun)
 Syn:  S over NP VP NP PP PP
 Sem:  Commission, promote, information, product, third country, on, in]
Main Types of Transducers for
Natural Language Processing
transducers with:
- input labels: natural language
- output labels: interpretation
- weights: variability of data, ranking of hypotheses
- n additional output strings at final states: delayed output of ambiguities,
  assignment of interpretations
[diagram: small example transducers over the input s-a-w with outputs
 such as s-ee / s-aw]
general case: - non-deterministic transducers with ε-transitions
optimisation: - determinisation and minimisation
Lexical Analysis of Natural
Language Texts
Lexical Analysis = Tokenisation + Morphological Analysis
- identification of basic units (words)
- mapping to normalised forms
- assignment of categories
- assignment of features
[diagram: FSA (states 0-6) recognising "think"/"thinks" in the input
 language, mapped to an output language of feature structures, e.g.
 category: verb, tense: present, pers: 3, num: singular]
Tokenization
• Just use white space?
• Tokenization rules?
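To make the contrast concrete, here is a minimal sketch in Python; the rule set is illustrative, not a complete tokenisation grammar:

```python
import re

# Illustrative token rules: dates and hyphenated words kept whole,
# punctuation split off as separate tokens. Order matters: the date
# alternative is tried before the generic word alternative.
TOKEN_RULES = re.compile(r"""
      \d{1,2}\.\d{1,2}\.\d{4}   # a date like 24.12.2006 as one token
    | \w+(?:-\w+)*              # words, incl. hyphenated compounds
    | [.,;:!?()"]               # punctuation as separate tokens
""", re.VERBOSE)

def tokenize_whitespace(text):
    return text.split()

def tokenize_rules(text):
    return TOKEN_RULES.findall(text)

text = 'On 24.12.2006, Mr. Smith said: "well-formed tokens matter."'
print(tokenize_whitespace(text))  # punctuation stays glued to words
print(tokenize_rules(text))       # date kept whole, punctuation split off
```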
Properties of
Natural Language Words
• very large set (how many?)
• word formation: concatenation with constraints
– simple words: word, *dorw
– compound words
• productive compounding and derivations
Drosselklappenpotentiometer
organise → organisation → organisational,
re-organise → re-organisation → re-organisational
be-wald-en *be-feld-en
• contiguous dependencies
go-es *walk-es
un-expect-ed-ly *un-elephant-ed-ly
• discontiguous (long distance) dependencies
expect-s *un-expect-s
mach-st *ge-mach-st
Modelling of
Natural Language Words
• Concatenation rules expressed in terms of
– meaningful word components (including simple words) (morphs)
– and their concatenation (morphotactics)
• Modelling approach
– lexicalisation of sets of meaningful components (morph classes)
– representation of these dictionaries with finite-state transducers
– specification of concatenation
[diagram: transducer sharing arcs for the stems work/book plus the suffix
 arcs -ing, -ed, -s]
Modelling of Natural Language
Words: System Overview
[system overview: a lexicon compiler builds the lexical transducer from]
- morpheme classes, e.g.
    stem:  (book:book) <affix>;  (box:box) <affix>
    affix: (ε:+noun +sg) <#>;  (s:+noun +pl) <#>
- morphotactics: concatenation of morpheme classes
- phonological/orthographic alternation rules, e.g. e-insertion:
    ε → e / {s, z, x} ^ __ s #
- special expressions, e.g. dates: ([0-9]²".")² 19[0-9]² ...
Non-Regular Phenomena
• ?
Modelling of Words:
Standard Case
noun-stem lexicon:   book <noun-suffix>;  work <noun-suffix>
noun-suffix lexicon: (ε:+N +Sg) #;  (s:+N +Pl) #
[diagram: lexical transducer over book/work with outputs +N+Sg / +N+Pl]
Lexical Transducer
- deterministic
- minimal
- additional output at final states
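A minimal sketch of this standard case in plain Python; the dicts stand in for the compiled, minimised transducer, and the entries follow the slide:

```python
# Lexicalised morph classes from the slide: stems continue to noun suffixes,
# suffixes carry the morphosyntactic output emitted at the final state.
STEMS = {"book": "book", "work": "work"}   # surface stem -> lemma
SUFFIXES = {"": "+N+Sg", "s": "+N+Pl"}     # surface suffix -> features

def analyse(word):
    """Return (lemma, features) analyses for a surface form."""
    analyses = []
    for stem, lemma in STEMS.items():
        if word.startswith(stem) and word[len(stem):] in SUFFIXES:
            analyses.append((lemma, SUFFIXES[word[len(stem):]]))
    return analyses

print(analyse("books"))  # [('book', '+N+Pl')]
print(analyse("work"))   # [('work', '+N+Sg')]
```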
Modelling of Words: Mapping
Ambiguities between Input and Output Language
surface forms leave / leaves / left all map to the lemma leave
[diagram: transducer over l-e-a-v-e(-s) / l-e-f-t; output symbols emitted
 with delay, since the prefix "le" does not yet determine the output]
Lexical Transducer
- deterministic
- delayed emission at final state
- minimal
Modelling of Words:
Overlapping of Matching Patterns
Wachstube: Wach-stube ('guard room') vs. Wachs-tube ('tube of wax')
[diagram: transducer over w-a-c-h-s-t-u-b-e with an ε:- arc inserting the
 morph boundary]
Lexical Transducer
- non-deterministic with ε-transitions:
  - determinisation not possible
  - infinite delay of output due to cycle
Ambiguities
- cannot be resolved at the lexical level
- must be preserved for later analysis steps
- require non-deterministic traversal
Modelling of Words:
Non-Disjoint Morpheme Classes
• problem:
– multiplication of start sections
– high degree of non-determinism
• solutions
– pure finite-state devices
• transducer with high degree of non-determinism
– heavy backtracking / parallel search
– slow processing
• determinisation
– explosion of network in size
– extended device: feature propagation along paths
• merging of morpheme classes
• reduction of degree of non-determinism
• minimisation
• additional checking of bit-vectors
Modelling of Words:
Non-Disjoint Morpheme Classes:
high degree of non-determinism
noun-stem
book <noun-stem>
<noun-suffix>
work <noun-stem>
<noun-suffix>
verb-suffix
(ε:+V) #
(ed:+V +past) #
(s:+V +3rd) #
[diagram: transducer with separate noun (N) and verb (V) start sections for
 book/work, reached over ε-arcs, continuing to the noun-suffix arcs (ns:
 ε:+N, s:+N) and the verb-suffix arcs (vs: ε:+V, ed:+V, s:+V)]
noun-suffix
(ε:+N +Sg) #
(s:+N +Pl) #
verb-stem
book <verb-suffix>
work <verb-suffix>
problem:
- each of the subdivisions must be searched separately
- determinisation is not feasible: it either leads to an explosion of the
  network or is not possible at all
Modelling of Words:
Non-Disjoint Morpheme Classes:
feature propagation
noun-stem
book <noun-stem>
<noun-suffix>
work <noun-stem>
<noun-suffix>
verb-suffix
(ε:+V) #
(ed:+V +past) #
(s:+V +3rd) #
noun-suffix
(ε:+N +Sg) #
(s:+N +Pl) #
verb-stem
book <verb-suffix>
work <verb-suffix>
merged dictionary with feature sets on the arcs:
  stem arcs:   book {N,V}, work {N,V}
  suffix arcs: ε {N,V}, ed {V}, s {N,V}, ε {N}
solution:
- interpreting continuation class as bit-feature
- added to the arcs
- checked during traversal (feature intersection)
- merging the dictionaries
- searching only one dictionary
(degree of non-determinism reduced considerably)
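A minimal sketch of this feature-propagation idea; plain Python dicts stand in for the merged transducer, and class sets are intersected during traversal as described above:

```python
# Morphs from the merged dictionary carry the set of continuation classes
# they belong to; traversal intersects these sets instead of searching
# separate N and V sub-dictionaries.
STEMS = {"book": {"N", "V"}, "work": {"N", "V"}}
SUFFIXES = {                       # surface suffix -> (features, classes)
    "":   [("+N +Sg", {"N"}), ("+V", {"V"})],
    "s":  [("+N +Pl", {"N"}), ("+V +3rd", {"V"})],
    "ed": [("+V +past", {"V"})],
}

def analyse(word):
    analyses = []
    for stem, stem_classes in STEMS.items():
        if not word.startswith(stem):
            continue
        rest = word[len(stem):]
        for feats, sfx_classes in SUFFIXES.get(rest, []):
            if stem_classes & sfx_classes:     # feature intersection check
                analyses.append(stem + feats)
    return analyses

print(analyse("booked"))  # ['book+V +past']
print(analyse("books"))   # ['book+N +Pl', 'book+V +3rd']
```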
Modelling of Words:
Long-Distance Dependencies
[diagram: feature structures (category: verb, tense: present; category:
 participle) linked to the admissible form sequences]
contiguous constraints:     mach-e, mach-st, mach-t, mach-en; wach-e, wachs-e
discontiguous constraints:  ge-mach-t, ge-wachs-t
invalid sequences:          *ge-mach-e, *ge-mach-st, *ge-wach-st
Constraints on the co-occurrence of morphs within words
Modelling of Words:
Long-Distance Dependencies
modelling alternatives
• pure finite-state transducer
– copies of the network
– can cause explosion in the size of the resulting transducer
• extended finite-state transducer
– "context-free" extension: simple memory: flags
– special treatment by analysis and generation routine
– keeps transducers small
Modelling of Words:
Long-Distance Dependencies:
network copies
[diagram: two copies of the verb-stem network (> 15,000 entries), one
 entered directly and one via a ge- prefix arc, each continuing to the
 inflection suffixes -en, -et, -n, -t]
problem: can cause explosion in the size of the resulting transducer
Modelling of Words:
Long-Distance Dependencies:
simple memory: flags
[diagram: a single verb-stem network with an optional ge- arc and the
 inflection suffixes -en, -et, -n, -t; arcs carry the flag operations
 @set(+ge), @require(+ge), @require(-ge)]
solution:
- state flags
- procedural interpretation (bit-vector operations)
Beesley/Karttunen, 2003
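A minimal sketch of how such flag diacritics can be interpreted procedurally; the helper names are chosen here, and real implementations use bit-vector operations on states:

```python
# Flags live in a small memory checked during traversal, so the network
# itself needs no ge-/non-ge copies.

def set_flag(flags, name):            # @set(+ge)
    return flags | {name}

def require(flags, name, value=True): # @require(+ge) / @require(-ge)
    return (name in flags) == value

# Traversing "ge-mach-t": the ge- arc sets +ge, the participle suffix
# arc requires it; the present suffix -st requires its absence.
flags = set()
flags = set_flag(flags, "ge")         # took the ge- prefix arc
print(require(flags, "ge", True))     # suffix -t: True, path accepted
print(require(flags, "ge", False))    # suffix -st: False, path blocked
```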
Modelling of Words:
Phon./Orth. Alternation Rules
• phenomena
– pity → pitiless
– fly → flies
– swim → swimming
– delete → deleting
– fox → foxes
• dictionary
  noun-stem:   dog <noun-suffix>;  fox <noun-suffix>
  noun-suffix: (ε:+N +Sg) #;  (s:+N +Pl) #
• rules: high-level specification of regular expressions
  – e-insertion: ε → e / {s, z, x} ^ __ s #
  – context restriction a => b _ c compiles to
    ~[[~[?* b] a ?*] | [?* a ~[c ?*]]]
Beesley/Karttunen, 2003: 61
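For illustration, the e-insertion alternation can be applied to an intermediate lexical string directly with a regular-expression substitution; this is a sketch, not a compiled rule transducer:

```python
import re

# Insert e between a morph ending in s/z/x and a plural suffix s at the
# word boundary, in the spirit of  ε -> e / {s, z, x} ^ __ s #.
def e_insertion(lexical):
    s = re.sub(r"([szx])\^(s#)", r"\1^e\2", lexical)
    return s.replace("^", "").replace("#", "")   # strip boundary symbols

print(e_insertion("fox^s#"))  # foxes
print(e_insertion("dog^s#"))  # dogs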
Modelling of Words:
Phon./Orth. Alternation Rules
• compilation of lexical transducer:
– construction of dictionary transducer
– construction of rule transducer
– composition of lexical transducer and rule transducer
Modelling of Words:
Phon./Orth. Alternation Rules
Jurafsky/Martin, 2000, p. 78
e-insertion rule for English plural nouns whose stems end in x, s, z
("foxes"):
   ε → e / {s, z, x} ^ __ s #
[diagram: the corresponding rule transducer with states r0-r5; arcs over
 s, z, x, ^:ε, ε:e, # and "other", returning to r0 at word boundaries]
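A minimal sketch running the same rule as an explicit state machine over the lexical tape; this is a simplified hand-coding of the diagram's intent, not the published r0-r5 transition table:

```python
# States: "start", "seen_zsx" (last symbol was z/s/x), "boundary" (a morph
# boundary just followed z/s/x). On boundary + s, output "es".
def e_insert_fst(tape):
    out, state = [], "start"
    for ch in tape:
        if state == "boundary":
            if ch == "s":
                out.append("es")       # ε:e then s  (fox^s -> foxes)
            elif ch not in "^#":
                out.append(ch)
            state = "seen_zsx" if ch in "zsx" else "start"
            continue
        if ch == "^":
            state = "boundary" if state == "seen_zsx" else "start"
        elif ch == "#":
            state = "start"
        else:
            out.append(ch)
            state = "seen_zsx" if ch in "zsx" else "start"
    return "".join(out)

print(e_insert_fst("fox^s#"))  # foxes
print(e_insert_fst("dog^s#"))  # dogs
```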
Diagram?
• Compute output for „aabba“
[diagram repeated: the work/book lexical transducer with outputs
 +N+Sg / +N+Pl]
• Tisch/Tische, Rat/Räte
• Numbers to Numerals for English (1-99)
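A minimal sketch for this exercise; since the numerals 1-99 decompose into finitely many unit, teen and ten pieces, the mapping is finite-state by construction (plain Python stands in for a number transducer):

```python
UNITS = ["", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"]

def numeral(n):
    """Map 1..99 to its English numeral."""
    assert 1 <= n <= 99
    if n < 10:
        return UNITS[n]
    if n < 20:
        return TEENS[n - 10]
    tens, units = divmod(n, 10)
    return TENS[tens] + ("-" + UNITS[units] if units else "")

print(numeral(7), numeral(15), numeral(40), numeral(99))
# seven fifteen forty ninety-nine
```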
• DMOR/SMOR
Modelling of Words:
Phon./Orth. Alternation Rules
[diagram: the dictionary transducer (arcs f:f o:o x:x +N:ε +Pl:^ ε:s ε:#
 and d:d o:o g:g ...) composed with the e-insertion rule transducer;
 the composition dictionary ∘ rule maps fox+N+Pl to foxes and
 dog+N+Pl to dogs]
Time complexity of transducer?
Layers of Linguistic Modelling
[diagram repeated from the earlier "Layers of Linguistic Modelling" slide:
 Text, Lex, Syn and Sem layers for "The Commission promotes information on
 products in third countries"]
Syntactic Analysis of Natural
Language Texts
Syntactic Analysis: mapping from input language to output language
- identification of well-formed sequences of words
- assignment of syntactic categories
- assignment of syntactic structure
example: "the good example" → [NP the good example]
Sequences of Words: Properties
and Well-Formedness Conditions
syntactic conditions: type-3
regular language concatenations
• local word ordering principles
– the good example * example good the
– could have been done *been could done have
• global word ordering principles
– (we) (gave) (him) (the book) * (gave) (him) (the book) (we)
Sequences of Words: Properties
and Well-Formedness Conditions
syntactic conditions: beyond type-3
concatenations beyond regular languages
• centre embedding (S → a S b)
• obligatorily paired correspondences
– either ... or, if ... then
– can be nested inside each other
• example (nested relative clauses):
the regulation [which the commission [which the Council had elected]
had formulated] defines ...
Sequences of Words: Properties
and Well-Formedness Conditions
syntactic conditions: beyond type-2
• concatenations beyond context-free languages
– cross-serial dependencies (Swiss German)
Jan säit das mer d'chind em Hans es huus lönd hälfe aastriiche
(John said that we the children-ACC Hans-DAT the house let help paint)
dependencies: d'chind ... lönd (x1-y1), em Hans ... hälfe (x2-y2),
es huus ... aastriiche (x3-y3)
Syntactic Grammars
• complete parsing
– goal: recover complete, exact parses of sentences
– closed-world assumption
• lexicon and grammar are complete
• place all types of conditions into one grammar
• seeking the globally best parse of the entire search space
– problems
• not robust
• too slow for mass data processing
• partial parsing
– goal: recover syntactic information
efficiently and reliably from unrestricted text
– sacrificing completeness and depth of analysis
– open-world assumption
• lexicon and grammar are incomplete
• local decisions
Abney, 1996
Syntactic Grammars:
Complete Sentence Structure ?
[diagram: partial structure of a noun phrase ("bottle closed with stopper
 fastening made of cork material or ..."): nodes NP, NP-c, AP, PP;
 syntactic links, semantic links and text structure distinguished]
Syntactic Grammars:
Complete Sentence Parsing
computational problem
• combinatorial explosion of readings
List the sales of products in 1973                                3 readings
List the sales of products produced in 1973                      10 readings
List the sales of products in 1973 with the products in 1972     28 readings
List the sales of products produced in 1973 with the products
  produced in 1972                                              455 readings
Bod, 1998: 2
„All Grammars Leak“*
• Not possible to provide an exact and
complete characterization
– of all well-formed utterances
– that cleanly divides them from all other sequences
of words which are regarded as ill-formed
utterances
• Rules are not completely ill-founded
• Somehow we need to make things looser to
account for the creativity of language use
*Edward Sapir, 1921
All Grammars Leak
• Example for „leaking“ rule?
All Grammars Leak
• Agreement in English
• “Why do some teachers, parents and
religious leaders feel that celebrating
their religious observances in home and
church are inadequate and deem it
necessary to bring those practices into
the public schools?”
Syntactic Structure: Partial
Parsing Approaches
• finite-state approximation of sentence structures (Abney 1995)
– finite-state cascades: sequences of levels of regular expressions
– recognition approximation: tail-recursion replaced by iteration
– interpretation approximation: embedding replaced by fixed levels
Syntactic Structure:
Finite State Cascades
• functionally equivalent to composition of transducers,
– but without intermediate structure output
– the individual transducers are considerably smaller than a
composed transducer
example:
  T1: "the good example" → "dete adje nomn"
  T2: "dete adje nomn" → [NP ... NP]
  T2 ∘ T1: "the good example" → [NP ... NP]
Syntactic Structure:
Finite-State Cascades (Abney)
Finite-State Cascade
[diagram for "the woman in the lab coat thought you were sleeping":
 L0: D N P D N N V-tns Pron Aux V-ing
 L1 (T1): [NP the woman] P [NP the lab coat] [VP thought] [NP you]
          [VP were sleeping]
 L2 (T2): [NP the woman] [PP in the lab coat] [VP thought] [NP you]
          [VP were sleeping]
 L3 (T3): [S the woman in the lab coat thought] [S you were sleeping]]
Regular-Expression Grammar:
 L1: NP → D? N* N | Pron
     VP → V-tns | Aux V-ing
 L2: PP → P NP
 L3: S → PP* NP PP* VP PP*
Syntactic Structure:
Finite-State Cascades (Abney)
• cascade consists of a sequence of levels
• phrases at one level are built on phrases at the
previous level
• no recursion: phrases never contain same level or
higher level phrases
• two levels of special importance
– chunks: non-recursive cores (NX, VX) of major phrases (NP,
VP)
– simplex clauses: embedded clauses as siblings
• patterns: reliable indicators of gist of syntactic
structure
Syntactic Structure:
Finite-State Cascades (Abney)
• each transduction is defined by a set of patterns
– category
– regular expression
• regular expression is translated into a finite-state automaton
• level transducer
– union of pattern automata
– deterministic recognizer
– each final state is associated with a unique pattern
• heuristics
– longest match (resolution of ambiguities)
• external control process
– if the recognizer blocks without reaching a final state,
a single input element is punted to the output and
recognition resumes at the following word
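A minimal sketch of one level transducer in this style, with helper names chosen here (Abney compiles the regular expressions into automata): the pattern functions encode NP → D? N* N | Pron and VP → V-tns | Aux V-ing from the grammar above, the longest match wins, and blocked input is punted:

```python
def match_np(tags, i):
    """NP -> D? N* N | Pron; returns match length at i, 0 if none."""
    if i < len(tags) and tags[i] == "Pron":
        return 1
    j = i + 1 if i < len(tags) and tags[i] == "D" else i
    k = j
    while k < len(tags) and tags[k] == "N":
        k += 1
    return k - i if k > j else 0

def match_vp(tags, i):
    """VP -> V-tns | Aux V-ing"""
    if i < len(tags) and tags[i] == "V-tns":
        return 1
    return 2 if tags[i:i + 2] == ["Aux", "V-ing"] else 0

PATTERNS = [("NP", match_np), ("VP", match_vp)]

def level(tags):
    out, i = [], 0
    while i < len(tags):
        cat, n = max(((c, f(tags, i)) for c, f in PATTERNS),
                     key=lambda p: p[1])
        if n:                       # longest match wins
            out.append(cat)
            i += n
        else:                       # recogniser blocks: punt one symbol
            out.append(tags[i])
            i += 1
    return out

# L0 tags of "the woman in the lab coat thought you were sleeping"
print(level(["D", "N", "P", "D", "N", "N", "V-tns", "Pron", "Aux", "V-ing"]))
# ['NP', 'P', 'NP', 'VP', 'NP', 'VP']
```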
Syntactic Structure:
Finite-State Cascades (Abney)
• patterns: reliable indicators of bits of syntactic
structure
• parsing
– easy-first parsing (easy calls first)
– proceeds by growing islands of certainty into
larger and larger phrases
– no systematic parse tree from bottom to top
– recognition of recognizable structures
– containment of ambiguity
• prepositional phrases and the like are left unattached
• noun-noun modifications not resolved
Syntactic Structure:
Bounding of Centre Embedding
• Sproat, 2002
• observation: unbounded centre embedding
– does not occur in language use
– seems to be too complex for human mental capacities
• finite state modelling of bounded centre embedding
[FSA diagram: states 0-9; paths traverse "the (man|dog)" up to three times,
 followed by the matching number of "bites"/"walks" occurrences]
S → the (man|dog) S1 (bites|walks)
S1 → the (man|dog) S2 (bites|walks)
S2 → the (man|dog) (bites|walks)
S1 → ε
S2 → ε
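Because the embedding depth is bounded, the grammar above denotes a regular language; a minimal sketch flattening it into one regular expression (names chosen here, Python re):

```python
import re

NP = r"the (?:man|dog) "
V  = r"(?:bites|walks)"
S2 = f"(?:{NP}{V} )?"          # S2 -> the (man|dog) (bites|walks) | ε
S1 = f"(?:{NP}{S2}{V} )?"      # S1 -> the (man|dog) S2 (bites|walks) | ε
S  = re.compile(f"{NP}{S1}{V}$")

print(bool(S.match("the dog bites")))                               # True
print(bool(S.match("the dog the man bites walks")))                 # True
print(bool(S.match("the dog the man the dog walks bites walks")))   # True
print(bool(S.match(
    "the dog the man the dog the man bites walks bites walks")))    # False: depth 3
```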
Modelling of Natural Language
Word Sequences: Approaches
Well-Formedness Conditions            Finite-State Cascades       Context-free / mildly context-sensitive
Syntactic: type-3 grammar
  local sequences (chunks)            lower level of cascade      rules with terminal symbols
  global sequences (phrases)          higher level of cascade     rules with terminal and non-terminal symbols
Syntactic: beyond type-3 grammar      (approximation)             (modelling)
  embeddings of phrases               fixed number of levels      variable number of levels
  centre embedding                    bounded centre embedding    centre embedding
  obligatorily paired correspondences individual patterns         feature processing
Syntactic: beyond type-2 grammar
  cross-serial dependencies           -                           special construction
Modelling of Natural Language
Word Sequences: Cases (1)
Well-Formedness Conditions            Finite-State Cascades            Context-free / mildly context-sensitive
Semantic
  ambiguity of noun sequences         longest-match heuristics         overgeneration of reading hypotheses
  ambiguity of prepositional          containment of ambiguity,        overgeneration of reading hypotheses
    phrase attachment                   underspecification
  semantic compatibility of words     (local grammars), (static)       case frames, procedural feature structures
                                        lexicalisation of
                                        cooccurrence restrictions
  meaning of words                    -                                -
Textual
  information organisation principles selected patterns                -
Modelling of Natural Language
Word Sequences: Cases (2)
• message understanding
– filling in relational database templates from newswire texts
– approach of FASTUS 1): cascade of five transducers
• recognition of names,
• fixed form expressions,
• basic noun and
verb groups
• patterns of events
– <company> <form><joint venture> with <company>
– "Bridgestone Sports Co. said Friday it has set up a joint venture in
Taiwan with a local concern and a Japanese trading house to
produce golf clubs to be shipped to Japan.”
• identification of event structures that describe the same event
Semantic Analysis: an example
1) Hobbs/Appelt/Bear/Israel/Kehler/Martin/Meyers/Kameyama/Stickel/Tyson (1997)
Relationship:    TIE-UP
Entities:        Bridgestone Sports Co.
                 a local concern
                 a Japanese trading house
JV Company:      -
Capitalization:  -
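As an illustration only (FASTUS applies a cascade of transducers over already-recognised names and noun groups, not raw regexes), a flat-pattern sketch of the event pattern above, with capture names chosen here:

```python
import re

# <company> ... set up a joint venture ... with <company>
PATTERN = re.compile(
    r"(?P<c1>[A-Z][\w.&-]*(?:\s+[A-Z][\w.&-]*)*)"   # capitalised name
    r".*?set up a joint venture.*?\bwith\s+"
    r"(?P<c2>.+?)(?=\s+to\b|[.,]|$)")               # second party, loosely

text = ("Bridgestone Sports Co. said Friday it has set up a joint venture "
        "in Taiwan with a local concern and a Japanese trading house to "
        "produce golf clubs to be shipped to Japan.")

m = PATTERN.search(text)
print(m.group("c1"))  # Bridgestone Sports Co.
print(m.group("c2"))  # a local concern and a Japanese trading house
```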
Summary: Linguistic Adequacy
• word formation
– essentially regular language
• sentence formation
– reduced recognition capacity (approximations)
• corresponds to language use
rather than natural language system
– flat interpretation structures
• clearly separate syntactic constraints from other (semantic, textual)
constraints
– partial interpretation structures
• clearly identify the contribution of syntactic structure in the interplay
with other structuring principles
• content
– suitable for restricted fact extraction
– deep text understanding generally still poorly understood
Summary: Practical Usefulness
• not all natural language phenomena can be
described with finite-state devices
• many actually occurring phenomena can be
described with regular devices
• not all practical applications require a
complete and deep processing of natural
language
• partial solutions allow for the development of
many useful applications
Summary: Complexity of Finite-
State Transducers for NLP
• theoretically: computationally intractable (Barton/Berwick/Ristad, 1987)
• but the SAT-style worst case behind this result is unnatural:
– natural language problems are bounded in size
• input and output alphabets,
• word length of linguistic words,
• partiality of functions and relations
– combinatorial possibilities are locally restricted.
• practically, natural language finite-state systems
– do not involve complex search
– are remarkably fast
– can in many relevant cases be determinised and minimised
Summary: Large Scale
Processing
• Context-free devices
– run-time complexity: |G| · n^3, where typically |G| >> n^3
– too slow for mass data processing
• Finite-state devices
– run-time complexity: best case: linear
(with low degree of non-determinism)
– best suited for mass data processing
References
Abney, Steven (1996). Tagging and Partial Parsing. In: Ken Church, Steve Young, and Gerrit Bloothooft (eds.),
Corpus-Based Methods in Language and Speech. Kluwer Academic Publishers, Dordrecht.
http://www.vinartus.net/spa/95a.pdf
Abney, Steven (1996a) Cascaded Finite-State Parsing. Viewgraphs for a talk given at Xerox Research Centre,
Grenoble, France. http://www.vinartus.net/spa/96a.pdf
Abney, Steven (1995). Partial Parsing via Finite-State Cascades. In: Journal of Natural Language Engineering,
2(4): 337-344. http://www.vinartus.net/spa/97a.pdf
Barton Jr., G. Edward; Berwick, Robert C. and Eric Sven Ristad (1987). Computational Complexity and Natural
Language. MIT Press.
Beesley, Kenneth R. and Lauri Karttunen (2003). Finite-State Morphology. Distributed for the Center for the Study
of Language and Information. (CSLI Studies in Computational Linguistics)
Bod, Rens (1998). Beyond Grammar. An Experience-Based Theory of Language. CSLI Lecture Notes, 88,
Stanford, California: Center for the Study of Language and Information
Grefenstette, Gregory (1999). Light Parsing as Finite State Filtering. In: Kornai 1999, pp. 86-94. Earlier version in:
Workshop on Extended finite state models of language, Budapest, Hungary, Aug 11--12, 1996. ECAI'96.
http://citeseer.nj.nec.com/grefenstette96light.html
Hobbs, Jerry; Doug Appelt, John Bear, David Israel, Andy Kehler, David Martin, Karen Meyers, Megumi
Kameyama, Mark Stickel, Mabry Tyson (1997). Breaking the Text Barrier. FASTUS Presentation slides. SRI
International. http://www.ai.sri.com/~israel/Generic-FASTUS-talk.pdf
Jurafsky, Daniel and James H. Martin (2000): Speech and Language Processing. An Introduction to Natural
Language Processing, Computational Linguistics and Speech Recognition. New Jersey: Prentice Hall.
Kornai, András (ed.) (1999). Extended Finite State Models of Language. (Studies in Natural Language
Processing). Cambridge: Cambridge University Press.
Koskenniemi, Kimmo (1983). Two-level morphology: a general computational model for word-form recognition
and production. Publication 11, University of Helsinki. Helsinki: Department of General Linguistics
References
Kunze, Jürgen (2001). Computerlinguistik. Voraussetzungen, Grundlagen, Werkzeuge. Lecture notes,
Humboldt-Universität zu Berlin. http://www2.rz.hu-
berlin.de/compling/Lehrstuhl/Skripte/Computerlinguistik_1/index.html
Manning, Christopher D.; Schütze, Hinrich (1999). Foundations of Statistical Natural Language Processing.
Cambridge, Mass., London: The MIT Press. http://www.sultry.arts.usyd.edu.au/fsnlp
Mohri, Mehryar (1997). Finite State Transducers in Language and Speech Processing. In: Computational
Linguistics, 23, 2, 1997, pp. 269-311. http://citeseer.nj.nec.com/mohri97finitestate.html
Mohri, Mehryar (1996). On some Applications of finite-state automata theory to natural language processing. In:
Journal of Natural Language Engineering, 2, pp. 1-20.
Mohri, Mehryar and Michael Riley (2002). Weighted Finite-State Transducers in Speech Recognition (Tutorial).
Part 1: http://www.research.att.com/~mohri/postscript/icslp.ps, Part 2:
http://www.research.att.com/~mohri/postscript/icslp-tut2.ps
Partee, Barbara; ter Meulen, Alice and Robert E. Wall (1993). Mathematical Methods in Linguistics. Dordrecht:
Kluwer Academic Publishers.
Pereira, Fernando C. N. and Rebecca N. Wright (1997). Finite-State Approximation of Phrase-Structure
Grammars. In: Roche/Schabes 1997.
Roche, Emmanuel and Yves Schabes (Eds.) (1997). Finite-State Language Processing. Cambridge (Mass.) and
London: MIT Press.
Sproat, Richard (2002). The Linguistic Significance of Finite-State Techniques. February 18, 2002.
http://www.research.att.com/~rws
Strzalkowski, Tomek; Lin, Fang; Ge, Jin Wang; Perez-Carballo, Jose (1999). Evaluating Natural Language
Processing Techniques in Information Retrieval. In: Strzalkowski, Tomek (Ed.): Natural Language
Information Retrieval, Kluwer Academic Publishers, Holland : 113-145
Woods, W.A. (1970). Transition Network Grammar for Natural Language Analysis. In: Communications of the
ACM 13: 591-602.