This thesis contributes to the investigation of the sound-to-grammar mapping by developing a computational model in which complex acoustic patterns can be represented conveniently, and exploited for simulating the prediction of English prefixes by human listeners.
The model is rooted in the principles of rational analysis and Firthian prosodic analysis, and is formulated in Bayesian terms. It is based on three core theoretical assumptions. First, the goals to be achieved and the computations to be performed in speech recognition, as well as the representation and processing mechanisms recruited, depend crucially on the task a listener is facing and on the environment in which the task occurs. Second, whatever the task and the environment, the human speech recognition system behaves optimally with respect to them. Third, internal representations of acoustic patterns are distinct from the linguistic categories associated with them.
The representational level exploits several tools and findings from the fields of machine learning and signal processing, and interprets them in the context of human speech recognition. Two tools are examined in particular because of their suitability for the modelling task at hand: the relevance vector machine (Tipping, 2001), which is capable of simulating the formation of linguistic categories from complex acoustic spaces, and the auditory primal sketch (Todd, 1994), which is capable of extracting the multi-dimensional features of the acoustic signal that are connected to prominence and rhythm, and of representing them in an integrated fashion. Model components based on these tools are designed, implemented and evaluated.
The implemented model, which accepts recordings of real speech as input, is compared in a simulation with the qualitative results of an eye-tracking experiment. The comparison provides useful insights into model behaviour, which are discussed.
Throughout the thesis, a clear distinction is drawn between the computational, representational and implementation devices adopted for model specification.