1. Toward a pattern-based analysis of English resultatives: Presenting a new type of usage-based approach to grammatical constructions
Masato YOSHIKAWA (Keio University)
April 24th, 2010, ELSJ International Spring Forum 2010
3. 1.1. Outline
Theme: The Resultative Construction (RC, henceforth), e.g., (1)
  (1) John hammered the metal flat.
Position: Usage-based view (e.g., Kemmer & Barlow 2000; Langacker 1987), based on the Pattern Lattice Model (Kuroda & Hasebe 2009; Kuroda 2009), a radically memory-based/exemplar-based model of language
Methodology: Quantitative research using the RC database collected by Boas (2003)
Conclusion: The RC is a “mosaic” of partially similar conventional phrases
4. 1.2. The aim of this talk
The aims of this talk:
  To show the possibility of a new approach to grammatical constructions based on the usage-based view
    Suggestion: “reductionist” approaches should not work
  To contribute to a “memory-based” or “exemplar-based” theory of human linguistic knowledge (e.g., Bod 2006; Pierrehumbert 2001; Port 2007)
What is implied:
  Constructions of an abstract kind = psychologically unreal!?
  Grammar = an epiphenomenon derived from analogical applications of conventionalized expressions!?
5. 1.3. The organization of this talk
Section 2: Provides a brief sketch of the Pattern Lattice Model (PLM)
Section 3: Reports the details of the quantitative research
Section 4: Discusses the results of the research
Section 5: Summarizes the whole discussion and remarks on the remaining problems
Section 6: Acknowledgements and additional references
7. 2.1. Pattern Lattice Model (PLM)
Pattern Lattice Model (PLM; Kuroda & Hasebe 2009; Kuroda 2009)
Assumption 1: The linguistic knowledge we have in mind = a collection of concrete exemplars of linguistic experiences
  Exemplars are considered almost equivalent to what we call “episodes” (e.g., Tulving 2002)
  The underlying idea: the hypothesis of “full memory”
Assumption 2: Those exemplars are connected to a vast number of “indices”
  Indices = any kind of abstract unit (e.g., phonemes, morphemes, lexemes, etc.)
  As for syntax: the relevant indices = “patterns”, whose definition is given below
8. 2.2. Patterns [1/3]
Where do patterns come from?
  Segment an exemplar e (e.g., (1a)) into units of arbitrary size to obtain T(e) (e.g., (1b))
  (1) a. John hammered the metal flat.
      b. [John, hammered, the metal, flat]
[Figure: the exemplar e = “John hammered the metal flat” and its segmentation T(e) = John / hammered / the metal / flat]
9. 2.2. Patterns [2/3]
Where do patterns come from?
  Replace each segment in turn with a variable X (shown here as “_”)
  The products of this procedure = patterns
  {[ _, hammered, the metal, flat], [ John, _, the metal, flat], [ John, hammered, _, flat], [ John, hammered, the metal, _ ]}
[Figure: the segmented exemplar and the four one-variable patterns derived from it]
10. 2.2. Patterns [3/3]
Where do patterns come from?
  Perform the replacement recursively until all the segments are replaced with variables
  The result = the pattern set P for e = P(e)
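The three steps above (segment, replace, repeat until only variables remain) can be sketched in a few lines of Python. This is a minimal illustration and not the script used in this research: the names `VAR` and `generate_patterns` are hypothetical, the sketch is written for Python 3, and it assumes that P(e) contains both the exemplar itself (no variables) and the all-variable pattern, since both appear in the lattice. It also assumes that no segment is the literal string "_".

```python
from itertools import product

VAR = "_"  # hypothetical marker for a variable slot, mirroring the slides' "_"


def generate_patterns(segments):
    """Return every pattern obtained by replacing any subset of segments with VAR.

    Assumption: the exemplar itself and the all-variable pattern are both
    included, so n segments yield 2**n patterns.
    """
    patterns = []
    # Each mask says, per segment, whether that segment is replaced by a variable.
    for mask in product([False, True], repeat=len(segments)):
        pattern = tuple(VAR if hide else seg for seg, hide in zip(segments, mask))
        patterns.append(pattern)
    return patterns


if __name__ == "__main__":
    for p in generate_patterns(["John", "hammered", "the metal", "flat"]):
        print(list(p))
```

Enumerating all variable/constant masks at once gives the same set as applying the one-segment replacement recursively, without producing duplicates along the way.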
11. 2.3. Pattern Lattice
What is a Pattern Lattice (PL)?
  A hierarchical network of patterns
  A partially ordered set where “≤” = the “is-a” relation
Is-a relation here:
  For p_i, p_j ∈ P, p_i is-a p_j when p_j matches p_i
  x = [a, b, _, d], y = [a, _, _, d]
  y matches x ⇒ x is-a y
The TOP of the PL = a pattern composed only of variable(s)
The BOTTOM of the PL = a set of exemplar(s)
Shown diagrammatically in the next slide
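The is-a relation can be checked mechanically. The following is a minimal sketch under one assumption that the slide does not state explicitly: “p_j matches p_i” is read as “the two patterns have the same number of segments, and every non-variable segment of p_j is identical to the segment at the same position in p_i”. The function names are hypothetical.

```python
VAR = "_"  # hypothetical marker for a variable slot


def matches(general, specific):
    """True if `general` matches `specific`.

    Assumed reading: same number of segments, and each segment of `general`
    is either a variable or identical to the corresponding segment of
    `specific`.
    """
    if len(general) != len(specific):
        return False
    return all(g == VAR or g == s for g, s in zip(general, specific))


def is_a(p_i, p_j):
    """p_i is-a p_j when p_j matches p_i."""
    return matches(p_j, p_i)


if __name__ == "__main__":
    x = ("a", "b", "_", "d")
    y = ("a", "_", "_", "d")
    print(is_a(x, y))  # y matches x, so x is-a y
    print(is_a(y, x))  # x does not match y
```

Under this reading the relation is reflexive, antisymmetric, and transitive, which is what makes the pattern set a partially ordered set.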
12. The Hasse diagram of PL
[Figure: Hasse diagram of a pattern lattice; axis label: RANK]
Created using Pattern Lattice Builder (http://www.kotonoba.net/rubyfca/)
13. 2.4. Why PLM?
PLM gives us:
  A solid foundation for the usage-based view of language
  A simple but powerful algorithm of pattern generation
  This means: the current Usage-based Model (e.g., Langacker 2000) = insufficient
A pattern-based analysis = an approach based on PLM
Note: PLM = only the beginning!
  We need: an additional procedure that tells us which patterns are useful
15. 3.1. Data
RC database collected by Boas (2003)
  Contains about 6000 examples of RCs obtained from the British National Corpus (BNC)
  Downloadable at http://cslipublications.stanford.edu/hand/1575864088appendix.pdf
Manual coding: each sentence is annotated with
  1) the head noun of Argument 1 = “Object” if transitive/“Subject” if intransitive
  2) the head noun of Argument 2 = “Subject” if transitive/NONE if intransitive
  3) the verb
  4) the resultative predicate
16. 3.1. Data in detail [1/4]
17. 3.1. Data in detail [2/4]
18. 3.1. Data in detail [3/4]
19. 3.1. Data in detail [4/4]
20. 3.2. Method
VP extraction
  Extract VPs from the manually coded data
  Tally the number of different VPs
Pattern generation
  Input the VPs into a self-made Python script to get patterns
  The tool employed ≠ what is shown in the ABSTRACT
  Python version: 2.6.5 (Windows)
Calculate the z-score of each pattern p, i.e., z(p):
  $z(p) = \frac{f(p) - \bar{f}(k)}{s(k)}$, where k is the rank of p
  f(p) = the frequency of p; $\bar{f}(k)$ = the average frequency of the patterns at rank k
  s(k) = the standard deviation of the frequency at rank k
  The z-score tells us how productive and conventional a pattern is
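The z-score step can be illustrated as follows. This is a rough sketch, not the script used in this research, and it rests on several assumptions that the slide does not spell out: z(p) is the standard score (f(p) minus the rank average, divided by the rank standard deviation); f(p) is the number of input VPs that the pattern matches; the rank of a pattern is its number of non-variable segments; and the population standard deviation is used. All function names are hypothetical, the toy VPs are invented for illustration, and the code is written for Python 3 rather than the 2.6.5 mentioned above.

```python
from collections import Counter, defaultdict
from itertools import product
from statistics import mean, pstdev

VAR = "_"  # hypothetical marker for a variable slot


def patterns_of(vp):
    """Yield every pattern derivable from one VP (exemplar included)."""
    for mask in product([False, True], repeat=len(vp)):
        yield tuple(VAR if hide else seg for seg, hide in zip(vp, mask))


def rank(pattern):
    """Assumed definition of rank: the number of non-variable segments."""
    return sum(seg != VAR for seg in pattern)


def z_scores(vps):
    """Map each pattern to its z-score relative to patterns of the same rank."""
    # f(p): how many input VPs instantiate pattern p (assumed definition).
    freq = Counter(p for vp in vps for p in patterns_of(vp))
    freqs_by_rank = defaultdict(list)
    for p, f in freq.items():
        freqs_by_rank[rank(p)].append(f)
    scores = {}
    for p, f in freq.items():
        fs = freqs_by_rank[rank(p)]
        s = pstdev(fs)  # population standard deviation (assumption)
        # If all patterns of a rank share one frequency, s is 0; score them 0.
        scores[p] = (f - mean(fs)) / s if s > 0 else 0.0
    return scores


if __name__ == "__main__":
    # Invented toy input in (verb, object, resultative predicate) order.
    toy_vps = [
        ("shoot", "him", "dead"),
        ("shoot", "her", "dead"),
        ("shoot", "man", "dead"),
        ("paint", "wall", "red"),
    ]
    scores = z_scores(toy_vps)
    for p in sorted(scores, key=scores.get, reverse=True)[:3]:
        print(p, round(scores[p], 2))
```

Normalizing within each rank keeps highly schematic patterns, which match almost everything, from dominating purely because of their raw frequency.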
21. 3.3. Results [1/2]
Overview
  3,376 different VPs
  11,392 patterns*
    *Notice! Different from the number shown in the ABSTRACT
  The “top” pattern: “shoot __ dead” (z = 43.6)
“Superior” patterns
  Shown in the table on the right
  Notice! Different from the table shown in the ABSTRACT
24. 4.1. Variety of slot positions
Inconsistency of slot positions
As for the top 100 patterns:
  V = “X _ _”: 5 pattern types
  O = “_ Y _”: 6 pattern types
  R = “_ _ Z”: 7 pattern types
  VO = “X Y _”: 8 pattern types
  OR = “_ Y Z”: 13 pattern types
  VR = “X _ Z”: 29 pattern types
  VOR = “X Y Z”: 32 pattern types
Overall (for the patterns whose z ≥ 1):
  V = 20; O = 10; R = 16; VO = 38; OR = 51; VR = 93; VOR = 106
This may mean: the resultative construction = an inconsistent set??
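The labels used above can be derived directly from a pattern. The sketch below assumes that each pattern is a triple in (verb, object, resultative predicate) order and names it by its lexically filled positions, following the slide's convention that “X _ Z” is VR. The function names are hypothetical, and the example patterns are the ones mentioned in this talk, not the full result set.

```python
from collections import Counter

VAR = "_"  # hypothetical marker for a variable slot
SLOT_NAMES = ("V", "O", "R")  # verb, object, resultative predicate


def pattern_type(pattern):
    """Name a pattern by its filled positions, e.g. ("shoot", "_", "dead") -> "VR"."""
    return "".join(name for name, seg in zip(SLOT_NAMES, pattern) if seg != VAR)


def tally_types(patterns):
    """Count how many patterns fall under each label."""
    return Counter(pattern_type(p) for p in patterns)


if __name__ == "__main__":
    examples = [
        ("shoot", "_", "dead"),
        ("_", "door", "open"),
        ("drive", "me", "mad"),
        ("beat", "_", "_"),
    ]
    for label, count in sorted(tally_types(examples).items()):
        print(label, count)
```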
25. 4.2. Remarks
Ubiquitous super-lexical patterns
  VO, OR, VR, and VOR are ubiquitous
  Suggestion: RC = irreducible to lexical factors!?
  One possibility: RC = a mosaic of conventional patterns
Bonus: additional examples (found in the Corpus of Contemporary American English, COCA: Davies 2008-)
  “_ door open”: creak door open, buzz door open, etc. = RCs with additional verbs
  “beat _ _”: beat ~ senseless = a new RP
Note: Examples with the verb make ≠ RC!?
27. 5.1. Summary of this research
This talk presents:
  A quantitative study of the Resultative Construction (RC)
  Under the radically usage-based model called the Pattern Lattice Model (PLM)
Findings:
  Slot positions of the patterns = highly inconsistent
  Productive patterns of RC = highly lexically specific = concrete
Conclusion:
  RC = a mosaic of conventional patterns (e.g., shoot _ dead, _ door open, drive me mad, etc.)
  But unfortunately this is only a suggestion…
28. 5.2. Remaining problems
“Semi-”concreteness
  The inputs employed to generate patterns = abstract arrays (= VOR) ≠ concrete item sequences (e.g., raw sentences)
  This means: this research = NOT entirely usage-based
No direct references to psychological reality
  Only the results of corpus research were provided
  A psychological experiment (or the like) will be needed
30. 6.1. Acknowledgements
Prof. Ippei INOUE (Keio University)
Mr. Fuminori NAKAMURA (Keio University)
31. 6.2. References
Boas, H. 2003. A constructional approach to resultatives. Stanford: CSLI Publications.
Bod, R. 2006. Exemplar-based syntax: How to get productivity from examples. The Linguistic Review, 23, 291-320.
Davies, M. 2008-. The Corpus of Contemporary American English (COCA): 400+ million words, 1990-present. Available online at http://www.americancorpus.org.
Kemmer, S., & Barlow, M. 2000. Introduction: A usage-based conception of language. In Barlow, M., & Kemmer, S. (eds.) Usage-based models of language (pp. vii-xxii). Stanford: CSLI Publications.
Kuroda, K. 2009. Pattern lattice as a model of linguistic knowledge and performance. Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation.
Kuroda, K., & Hasebe, Y. 2009. Modeling (human) knowledge and processing of natural language using pattern lattice. 15th Annual Meeting of Japanese Society of Natural Language Processing, 670-673.
Langacker, R. 1987. Foundations of cognitive grammar, Vol. 1: Theoretical prerequisites. Stanford: Stanford University Press.
Langacker, R. 2000. A dynamic usage-based model. In Barlow, M., & Kemmer, S. (eds.) (pp. 1-63).
Pierrehumbert, J. 2001. Exemplar dynamics: Word frequency, lenition and contrast. In Bybee, J., & Hopper, P. (eds.) Frequency and the emergence of linguistic structure (pp. 137-157). Amsterdam: John Benjamins.
Port, R. 2007. How words are stored in memory: Beyond phones and phonemes. New Ideas in Psychology, 25, 143-170.
Tulving, E. 2002. Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1-25.