What are the functional units in
           reading?


               Evidence for statistical
               variation influencing
               reading



                            Alastair Smith
                        Padraic Monaghan
The Debate

   What information is used to map orthography
    onto phonology?




    The Debate   Theoretical Models   Contrasting Evidence   Study   Discussion
Competing Models

   Dual-Route Models
     –   Dual-Route Cascaded Model (Coltheart et al., 1993)
     –   Connectionist Dual Process Model (Zorzi et al., 1998)
     –   CDP+ (Perry, Ziegler & Zorzi, 2007)

   Single Route Models
     –   Parallel Distributed Processing Model
     –   Seidenberg & McClelland, 1989
     –   Plaut, McClelland, Seidenberg & Patterson, 1996
     –   Harm & Seidenberg, 1999
Dual-Route Models

    Lexical Route

    Sub-lexical Route

    Serial Processing

    Explicit level of
     representation for
     graphemes
                                         The dual-route cascaded model.
                                         From The CDP+ Model of Reading Aloud.
                                         By Perry, C., Ziegler, J.C. & Zorzi, M., 2007
                                         Psychological Review, 114, p.275.




Single Route Models
    Parallel Processing

    Encodes statistical
     relations between
     patterns of letters and
     their pronunciation

    Single letters provide
     input

                                                             The ‘triangle’ model.
                                                             From Computing the meanings of words in reading.
                                                             By Harm M.W., Seidenberg M.S., 2004
                                                             Psychological Review, 111, p.663.

Psycholinguistic Grain Size Theory
(Ziegler & Goswami, 2005)

   Type of processing that occurs in reading system
    determined by statistical relations between orthography
    and phonology.

   Grain sizes:
      –   Language specific
      –   Allow for efficient mapping


   Learning to read is learning to find shared grain sizes in
    orthography and phonology.
Graphemes

   Written representations of phonemes

   Can be composed of multiple letters:
    – Digraphs
    – Trigraphs




From Exploring Grain-Size Effects in Reading.
By Pagliuca, G., Monaghan, P., 2008
Proc 30th Ann Conf Cog Sci Soc. Mahwah, NJ:
Lawrence Erlbaum.



Whammies and Double Whammies
(Rastle & Coltheart, 1998)

   Participants read non-words containing 3 phonemes (e.g. fooce)
    more slowly than control non-words containing 5 phonemes (e.g. fruls)

   Behavioural study supported by simulation data from dual route
    model
      –   Non-lexical route processes non-word serially left to right, letter by
          letter

   Conclusions:
      –   Reading system is serial
      –   Functional unit is the letter, not the digraph

                                                      From Whammies and double whammies.
                                                      By Rastle, K., Coltheart, M., 1998
                                                      Psychon Bull Rev, 5, 277-282


Grain-Size Effects in Reading
(Pagliuca, Monaghan & McIntosh, 2008)

   Findings seem to contradict those of Rastle & Coltheart, 1998

   Indicates grain-size adapts according to statistics in the
    orthography - phonology mapping

   Hypothesis:
      –   If graphemes are functional units within the reading system, then a
          word containing a multi-letter grapheme should be read more
          accurately than a word without given the same kind of perceptual
          noise to impair the orthographic input.

   Modelling data from single route model supported by behavioural
    study

Modelling:
Pagliuca, Monaghan & McIntosh, 2008

   Single route model based on Harm & Seidenberg, 1999
   Orthographic input represented by 8 letter slots
   Activation from input letter slots reduced along monotonic gradient from left to right
    so that the lowest level of activation was in the leftmost slot
      –   two severities of impairment applied, severe and mild

   Model tested on two sets of 62 words, all 5 letters in length and monosyllabic
      –   Set 1: Digraphs in initial position
               ch, sh, th
      –   Set 2: Control set (no digraphs)
               cr, st, tr

   Words beginning with digraphs were read more accurately

                                                      From Exploring Grain-Size Effects in Reading.
                                                      By Pagliuca, G., Monaghan, P., 2008
                                                      Proc 30th Ann Conf Cog Sci Soc. Mahwah, NJ:
                                                      Lawrence Erlbaum.




Behavioural Study:
Pagliuca, Monaghan & McIntosh, 2008

   Same sets of words used in the behavioural study as used in the simulation. 84
    additional filler words selected, each five letters long with different initial bigrams and
    initial letters to the experimental and control stimuli

   Visual noise applied to stimuli from left to right,
    similar to noise applied in simulation study

   Participants completed naming task in which each word was presented for 250ms

   15 university students participated,
    all native English speakers

   Words with digraphs were reported more
    accurately than words without, confirming
    predictions made by the model

                                                      From Exploring Grain-Size Effects in Reading.
                                                      By Pagliuca, G., Monaghan, P., 2008
                                                      Proc 30th Ann Conf Cog Sci Soc. Mahwah, NJ:
                                                      Lawrence Erlbaum.



Conclusions:
Pagliuca, Monaghan & McIntosh, 2008

   Modelling:
      –   For digraphs two letter positions contribute to the activation of a single
          phoneme, whereas for non-digraphs each letter only contributes to one
          phoneme’s activation

      –   Graphemes emerge in the course of a system learning the regularities between
          orthographic and phonological representations of words

   Behavioural study:

      –   Indicates computational properties have a profound effect on reading, at least
          under conditions where visual input is impaired

   Different computational properties of the mapping between letters and
    phonemes suggest psycholinguistic effects of words should vary
    according to the compositionality of the mapping

Research Aims:

1. Using a computational model of reading based
   on Harm & Seidenberg, 1999,
   can we extend the digraph effects found in
   Pagliuca, Monaghan & McIntosh, 2008 to
   non-words?

2. Test predictions raised by model in
   experimental studies.

Modelling Study: Design (Model)
   Computational Model:
     –   Based on Harm & Seidenberg 1999

     –   Orthographic Input Layer:
          •   10 letter slots
          •   One of 26 units active in
              each slot to represent letter

     –   Hidden Layer: 100 units

     –   Phonological Output Layer:
          •   8 phoneme slots
          •   Each phoneme represented in terms
              of 25 phonological features

     –   25 Clean-up units

                                                        From Phonology, Reading Acquisition, and Dyslexia.
                                                        By Harm, M. W., Seidenberg, M.S., 1999,
                                                        Psychol Rev, 106, 491-528
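The slot-based input coding listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name encode_word and the left-aligned placement of letters in the slots are assumptions, since the slide does not state how words are aligned.

```python
import string

N_SLOTS = 10      # letter slots in the orthographic input layer (from the slide)
N_LETTERS = 26    # one unit per letter within each slot (from the slide)
ALPHABET = string.ascii_lowercase


def encode_word(word):
    """Return a flat 10 x 26 one-hot vector for `word`.

    Assumption: letters are left-aligned in the slots; the alignment
    scheme is not given on the slide.
    """
    if len(word) > N_SLOTS:
        raise ValueError("word is longer than the number of letter slots")
    vector = [0.0] * (N_SLOTS * N_LETTERS)
    for slot, letter in enumerate(word.lower()):
        # Exactly one of the 26 units is switched on in each occupied slot.
        vector[slot * N_LETTERS + ALPHABET.index(letter)] = 1.0
    return vector


# Layer sizes implied by the figures on the slide.
LAYER_SIZES = {
    "orthographic_input": N_SLOTS * N_LETTERS,  # 10 slots x 26 letters
    "hidden": 100,
    "phonological_output": 8 * 25,              # 8 slots x 25 features
    "cleanup": 25,
}

vec = encode_word("chair")
```

For a five-letter word only the first five slots carry activation; the remaining slots stay at zero.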
Modelling Study: Design (Training)

  –   Training corpus:
       •   6229 monosyllabic words,
       •   Words 1 to 8 letters in length


  –   Training algorithm:
       •   backpropagation learning algorithm (Rumelhart, Hinton & Williams, 1986)


  –   5 million cycles of training, words submitted randomly
      according to frequency

  –   99.9% accuracy following training (tested on training corpus)
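Presenting words "randomly according to frequency" can be sketched as frequency-weighted sampling. The mini-corpus, its frequency counts, and the function name below are hypothetical, and sampling in direct proportion to raw frequency is an assumption; the slide does not say whether frequencies were compressed before sampling.

```python
import random
from collections import Counter

# Hypothetical mini-corpus: word -> frequency count (illustrative values only).
corpus_frequencies = {"there": 1000, "chair": 50, "thumb": 10}


def sample_training_words(frequencies, n_samples, seed=0):
    """Draw training words with probability proportional to frequency."""
    rng = random.Random(seed)          # seeded so the sketch is repeatable
    words = list(frequencies)
    weights = [frequencies[w] for w in words]
    return rng.choices(words, weights=weights, k=n_samples)


samples = sample_training_words(corpus_frequencies, 2000)
counts = Counter(samples)
```

Higher-frequency words are drawn more often, so the network receives proportionally more training on them.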


Modelling Study: Design (Stimuli)
–    Stimuli sets each containing 64 items:
       •   Words with digraph in onset
       •   Non-words with digraph in onset
       •   Control Words
       •   Control Non-words

–    All Words and Non-words 5 letters in length and Monosyllabic
–    Onset pairings matched for same initial letter and similar bigram frequency
–    Controls applied:
       •   Word frequency
       •   Body Friends and Body Enemies
       •   Neighbours
       •   Unigram and Bigram frequency
       •   Partial View Predictability

–    Non-words were formed by switching onsets and rimes within given word set
     (Controls were performed on non-words following formation)
–    Noise applied in three conditions:
       •   No Noise
       •   Uniform 50% reduction in input activation
       •   Decreasing noise condition (replication of Pagliuca, Monaghan & McIntosh, 2008)
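The three noise conditions can be sketched as per-slot scaling of the input activations. The linear shape of the gradient and the floor value of 0.2 are assumptions made for illustration; the slides give only the 50% uniform reduction and the direction of the gradient (lowest activation in the leftmost slot).

```python
N_SLOTS = 10     # letter slots (from the model design)
N_LETTERS = 26   # units per slot


def slot_scales(condition, n_slots=N_SLOTS, floor=0.2):
    """Return one multiplicative scale per letter slot.

    `floor` (the leftmost activation level) is a hypothetical parameter.
    """
    if condition == "none":
        return [1.0] * n_slots
    if condition == "uniform":
        return [0.5] * n_slots          # uniform 50% reduction
    if condition == "decreasing":
        # Impairment is strongest at the leftmost slot and decreases
        # towards the right, so activation rises from `floor` to 1.0.
        step = (1.0 - floor) / (n_slots - 1)
        return [floor + step * i for i in range(n_slots)]
    raise ValueError(f"unknown noise condition: {condition}")


def apply_noise(vector, condition):
    """Scale each input unit by the factor of the slot it belongs to."""
    scales = slot_scales(condition)
    return [a * scales[i // N_LETTERS] for i, a in enumerate(vector)]


# Illustrative input with every unit fully active.
full_input = [1.0] * (N_SLOTS * N_LETTERS)
uniform = apply_noise(full_input, "uniform")
decreasing = apply_noise(full_input, "decreasing")
```

Scaling whole slots rather than individual units keeps the identity of each letter intact while reducing how strongly it drives the hidden layer.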

Modelling Study: Results (Words)

Model performance on word
sets:

  –   Both sets read with 100%
      accuracy before noise
      applied

  –   Digraph set read with
      greater accuracy when
      input uniformly impaired
      (t(126) = 2.453, p < 0.01)

  –   Digraph set read with
      greater accuracy in
      decreasing noise condition
      (t(126) = 4.396, p < 0.01)

[Figure: model accuracy on word sets by noise condition; ** p<0.01, * p<0.05]
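The reported degrees of freedom are consistent with an independent-samples comparison by item between two sets of 64 stimuli. This is an inference from the numbers on the slides, not something the slides state:

```latex
\[
  df = n_1 + n_2 - 2 = 64 + 64 - 2 = 126
\]
```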

Modelling Study: Results (Non-words)

Model performance on non-word
sets:

  –   Accuracy based on comparing
      output to target. Target formed by
      combining phonetic representation
      of onset and rime extracted from
      corpus

  –   Lower accuracy in reproduction of
      digraph set before noise applied

  –   Digraphs read more accurately in
      non-words when input uniformly
      impaired
      (t(126) = 3.355, p < 0.01)

  –   Non-words containing digraphs
      read more accurately in
      decreasing noise condition
      (t(126) = 2.495, p < 0.01)

[Figure: model accuracy on non-word sets by noise condition; ** p<0.01, * p<0.05, ++ p<0.01 based on error in onset]
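Forming a non-word target from the onset and rime of existing corpus words can be sketched as below. The two corpus entries, their phoneme symbols, and the function names are hypothetical placeholders, and scoring by exact phoneme match is an assumption; the slides do not give the scoring criterion in detail.

```python
# Hypothetical corpus entries: each word split into onset and rime, with an
# illustrative phoneme list for each part (placeholder transcriptions).
corpus = [
    {"onset": "ch", "rime": "imp", "onset_phon": ["tS"], "rime_phon": ["I", "m", "p"]},
    {"onset": "sh", "rime": "ack", "onset_phon": ["S"], "rime_phon": ["a", "k"]},
]


def build_lookup(entries):
    """Collect onset and rime pronunciations found in the corpus."""
    onsets = {e["onset"]: e["onset_phon"] for e in entries}
    rimes = {e["rime"]: e["rime_phon"] for e in entries}
    return onsets, rimes


def nonword_target(onset, rime, onsets, rimes):
    """Target pronunciation = onset phonemes followed by rime phonemes."""
    return onsets[onset] + rimes[rime]


def is_correct(output, target):
    """Assumed scoring rule: the output must match the target exactly."""
    return output == target


onsets, rimes = build_lookup(corpus)
# Switching onsets and rimes: "ch" (from chimp) + "ack" (from shack).
target = nonword_target("ch", "ack", onsets, rimes)
```

Because the target is assembled from pronunciations the model has already been trained on, an error on a non-word can be traced to either the onset or the rime.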



Model Predictions:


 Both words and non-words containing digraphs
 in the initial position will be identified with
 greater accuracy than controls.

  –   For digraphs in both words and non-words two letter
      positions contribute to the activation of a single
      phoneme


Behavioural Study: Design (Stimuli)

  –   4 stimuli sets taken from simulation:
            Control Non-words
            Control Words
            Words with digraphs in onset
            Non-words with digraphs in onset

  –   2-dimensional digital pixel noise applied across word
      in decreasing gradient from left to right



   [Figure: Example of control word with visual noise applied]
   [Figure: Example of control non-word with visual noise applied]


Behavioural Study: Design (Procedure)
   Participants:
     –   15 university students
     –   All native English speakers

   Lexical decision task:
     –   departs from the naming task used by Pagliuca, Monaghan
         & McIntosh, 2008

   Procedure:
     –   Short practice period
     –   Fixation cross presented before stimuli
     –   Stimuli selected at random without replacement
     –   Stimuli presented for 250ms
     –   Response recorded by key press
     –   256 trials completed by each participant
Behavioural Study: Results (Accuracy)

 Accuracy of Response:

    Words containing digraphs were responded to more
     accurately than controls (t(14) = 3.254, p<0.01)

    Non-words containing digraphs were responded to less
     accurately than controls (t(14) = 2.457, p<0.05)

 [Figure: response accuracy for digraph and control items; ** p<0.01, * p<0.05]


Behavioural Study: Results
(Response Times)




 Response Times:
    Similar trends were found in participants' reaction times, although significance levels were
    not reached

Summary

   Modelling:
     –   Greater accuracy reading both words and non-words containing
         digraphs in the initial position in high level noise conditions.

     –   For digraphs, two letter positions contribute to the activation of a single
         phoneme

   Behavioural study:
     –   Words containing digraphs identified with greater accuracy than
         controls when visual noise applied in a decreasing gradient across
         word

     –   Non-words containing digraphs identified with less accuracy than
         controls when visual noise applied in a decreasing gradient across
         word

Discussion (1)

   Task differences:
     –   Word naming task:
          •   Pagliuca, Monaghan & McIntosh, 2008
          •   Rastle and Coltheart, 1998
          •   Modelling study


     –   Lexical decision task:
          •   Behavioural study



Discussion (2)

   Simulation and Behavioural data showed an advantage
    for words containing digraphs:
     –   Replication of Pagliuca, Monaghan & McIntosh, 2008

     –   Indicates the grain size for reading in English is adaptable
         according to statistics of the letter-sound mapping

     –   Challenges views on independence of letter recognition (Pelli,
         Farrell and Moore, 2003) indicating word perception affected by
         statistics in the language


Discussion (3)

   Combined findings:
     –   Single Route (Parallel Processing) Model:
          •   Provides explanation for increased accuracy in identifying
              digraph words displayed by simulation and behavioural
              data
              (Pagliuca, Monaghan & McIntosh, 2008)

          •   Model predicted advantage for reading digraph non-words,
              however behavioural data showed lower accuracy of
              response and slower reaction times


Discussion (4)

   Combined findings:
     –   Dual Route (Serial Processing) Model:

          •   Provides explanation for reduced accuracy in digraph non-
              word response
              (Rastle & Coltheart, 1998)


          •   Digraph word advantage not predicted by the model's lexical
              route



Direction of Future Study

   Non-word Naming Task

   Digraphs in final position
     –   If the non-lexical route is serial, this should lead to slower response
         times
         (Rastle & Coltheart, 1998)


   Use similar paradigm to investigate grain-size effects
    in languages with differing grain-size


Special Thanks & Acknowledgements



 Experimental Psychology Society


 Lancaster University
Questions:

More Related Content

What's hot

Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis SystemEvaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis SystemIJERA Editor
 
dialogue act modeling for automatic tagging and recognition
 dialogue act modeling for automatic tagging and recognition dialogue act modeling for automatic tagging and recognition
dialogue act modeling for automatic tagging and recognitionVipul Munot
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESkevig
 
Rule-based Prosody Calculation for Marathi Text-to-Speech Synthesis
Rule-based Prosody Calculation for Marathi Text-to-Speech SynthesisRule-based Prosody Calculation for Marathi Text-to-Speech Synthesis
Rule-based Prosody Calculation for Marathi Text-to-Speech SynthesisIJERA Editor
 
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGESYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGEijnlc
 
PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabili...
PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabili...PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabili...
PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabili...Rommel Carvalho
 
GENETIC APPROACH FOR ARABIC PART OF SPEECH TAGGING
GENETIC APPROACH FOR ARABIC PART OF SPEECH TAGGINGGENETIC APPROACH FOR ARABIC PART OF SPEECH TAGGING
GENETIC APPROACH FOR ARABIC PART OF SPEECH TAGGINGijnlc
 
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...IJECEIAES
 
Suitability of naïve bayesian methods for paragraph level text classification...
Suitability of naïve bayesian methods for paragraph level text classification...Suitability of naïve bayesian methods for paragraph level text classification...
Suitability of naïve bayesian methods for paragraph level text classification...ijaia
 
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...ijtsrd
 
Exploiting rules for resolving ambiguity in marathi language text
Exploiting rules for resolving ambiguity in marathi language textExploiting rules for resolving ambiguity in marathi language text
Exploiting rules for resolving ambiguity in marathi language texteSAT Journals
 
Text level descriptions
Text level descriptionsText level descriptions
Text level descriptionsAndrea Hnatiuk
 

What's hot (14)

Y24168171
Y24168171Y24168171
Y24168171
 
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis SystemEvaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
 
dialogue act modeling for automatic tagging and recognition
 dialogue act modeling for automatic tagging and recognition dialogue act modeling for automatic tagging and recognition
dialogue act modeling for automatic tagging and recognition
 
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIESTHE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
THE ABILITY OF WORD EMBEDDINGS TO CAPTURE WORD SIMILARITIES
 
Rule-based Prosody Calculation for Marathi Text-to-Speech Synthesis
Rule-based Prosody Calculation for Marathi Text-to-Speech SynthesisRule-based Prosody Calculation for Marathi Text-to-Speech Synthesis
Rule-based Prosody Calculation for Marathi Text-to-Speech Synthesis
 
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGESYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE
 
PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabili...
PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabili...PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabili...
PrOntoLearn: Unsupervised Lexico-Semantic Ontology Generation using Probabili...
 
Fonteneau_etal_15
Fonteneau_etal_15Fonteneau_etal_15
Fonteneau_etal_15
 
GENETIC APPROACH FOR ARABIC PART OF SPEECH TAGGING
GENETIC APPROACH FOR ARABIC PART OF SPEECH TAGGINGGENETIC APPROACH FOR ARABIC PART OF SPEECH TAGGING
GENETIC APPROACH FOR ARABIC PART OF SPEECH TAGGING
 
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
 
Suitability of naïve bayesian methods for paragraph level text classification...
Suitability of naïve bayesian methods for paragraph level text classification...Suitability of naïve bayesian methods for paragraph level text classification...
Suitability of naïve bayesian methods for paragraph level text classification...
 
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
 
Exploiting rules for resolving ambiguity in marathi language text
Exploiting rules for resolving ambiguity in marathi language textExploiting rules for resolving ambiguity in marathi language text
Exploiting rules for resolving ambiguity in marathi language text
 
Text level descriptions
Text level descriptionsText level descriptions
Text level descriptions
 

Similar to What are the functional units in reading? Evidence for statistical variation influencing word processing

Phonaesthemes: A Corpus-based Analysis
Phonaesthemes: A Corpus-based AnalysisPhonaesthemes: A Corpus-based Analysis
Phonaesthemes: A Corpus-based Analysiskotis
 
The Characteristics of DNA Splicing Languages via Yusof-Goode Approach
The Characteristics of DNA Splicing Languages via Yusof-Goode ApproachThe Characteristics of DNA Splicing Languages via Yusof-Goode Approach
The Characteristics of DNA Splicing Languages via Yusof-Goode ApproachAzrin Sunbae
 
Semantic Glimmers: CSDL9
Semantic Glimmers: CSDL9Semantic Glimmers: CSDL9
Semantic Glimmers: CSDL9kotis
 
Cognitive plausibility in learning algorithms
Cognitive plausibility in learning algorithmsCognitive plausibility in learning algorithms
Cognitive plausibility in learning algorithmsAndré Karpištšenko
 
A Neural Probabilistic Language Model.pptx
A Neural Probabilistic Language Model.pptxA Neural Probabilistic Language Model.pptx
A Neural Probabilistic Language Model.pptxRama Irsheidat
 
Corpus linguistics and multi-word units
Corpus linguistics and multi-word unitsCorpus linguistics and multi-word units
Corpus linguistics and multi-word unitsPascual Pérez-Paredes
 
Phonetic Recognition In Words For Persian Text To Speech Systems
Phonetic Recognition In Words For Persian Text To Speech SystemsPhonetic Recognition In Words For Persian Text To Speech Systems
Phonetic Recognition In Words For Persian Text To Speech Systemspaperpublications3
 

Similar to What are the functional units in reading? Evidence for statistical variation influencing word processing (7)

Phonaesthemes: A Corpus-based Analysis
Phonaesthemes: A Corpus-based AnalysisPhonaesthemes: A Corpus-based Analysis
Phonaesthemes: A Corpus-based Analysis
 
The Characteristics of DNA Splicing Languages via Yusof-Goode Approach
The Characteristics of DNA Splicing Languages via Yusof-Goode ApproachThe Characteristics of DNA Splicing Languages via Yusof-Goode Approach
The Characteristics of DNA Splicing Languages via Yusof-Goode Approach
 
Semantic Glimmers: CSDL9
Semantic Glimmers: CSDL9Semantic Glimmers: CSDL9
Semantic Glimmers: CSDL9
 
Cognitive plausibility in learning algorithms
Cognitive plausibility in learning algorithmsCognitive plausibility in learning algorithms
Cognitive plausibility in learning algorithms
 
A Neural Probabilistic Language Model.pptx
A Neural Probabilistic Language Model.pptxA Neural Probabilistic Language Model.pptx
A Neural Probabilistic Language Model.pptx
 
Corpus linguistics and multi-word units
Corpus linguistics and multi-word unitsCorpus linguistics and multi-word units
Corpus linguistics and multi-word units
 
Phonetic Recognition In Words For Persian Text To Speech Systems
Phonetic Recognition In Words For Persian Text To Speech SystemsPhonetic Recognition In Words For Persian Text To Speech Systems
Phonetic Recognition In Words For Persian Text To Speech Systems
 

What are the functional units in reading? Evidence for statistical variation influencing word processing

  • 1. What are the functional units in reading? Evidence for statistical variation influencing reading Alastair Smith Padraic Monaghan
  • 2. The Debate  What information is used to map orthography onto phonology? The Debate Theoretical Models Contrasting Evidence Study Discussion
  • 3. Competing Models  Dual-Route Models – Dual-Route Cascade Model (Coltheart et al, 1993) – Connectionist Dual Process Model (Zorzi et al, 1998) – CDP+ (Perry, Zorzi & Ziegler, 2007)  Single Route Models – Parallel Distributed Processing Model – Seidenberg & McClelland, 1989 – Plaut, McClelland, Seidenberg & Patterson, 1996 – Harm & Seidenberg, 1999 The Debate Theoretical Models Contrasting Evidence Study Discussion
  • 4. Dual-Route Models  Lexical Route  Sub-lexical Route  Serial Processing  Explicit level of representation for graphemes The dual-route cascaded model. From The CDP+ Model of Reading Aloud. By Perry, C., Ziegler, J.C. & Zorzi, M., 2007 Psychological Review, 114, p.275. The Debate Theoretical Models Contrasting Evidence Study Discussion
  • 5. Single Route Models The ‘triangle’ model.  Parallel Processing From Computing the meanings of words in reading. By Harm M.W., Seidenberg M.S., 2004 Psychological Review, 111, p.663.  Encodes statistical relations between patterns of letters and their pronunciation  Single letters provide input The Debate Theoretical Models Contrasting Evidence Study Discussion
  • 6. Psycholinguistic Grain Size Theory (Ziegler & Goswami, 2005)  Type of processing that occurs in reading system determined by statistical relations between orthography and phonology.  Grain sizes: – Language specific – Allow for efficient mapping  Learning to read is learning to find shared grain sizes in orthography and phonology. The Debate Theoretical Models Contrasting Evidence Study Discussion
  • 7. Graphemes  Written representations of phonemes  Can be composed of multiple letters: – Digraphs – Trigraphs From Exploring Grain-Size Effects in Reading. By Pagliuca, G., Monaghan, P., 2008 Proc 30th Ann Conf Cog Sci Soc. Mahwah, NJ: Lawrence Erlbaum. The Debate Theoretical Models Contrasting Evidence Study Discussion
  • 8. Whammies and Double Whammies (Rastle & Coltheart, 1998)  Participants read non-words containing 3 phonemes (e.g. fooce) slower than control non-words containing 5 graphemes (e.g. fruls)  Behavioural study supported by simulation data from dual route model – Non-lexical route processes non-word serially left to right, letter by letter  Conclusions: – Reading system is serial – Functional unit is the letter, not the digraph From Whammies and double whammies. By Rastle, K., Coltheart, M., 1998 Psychon Bull Rev, 5, 277-282 The Debate Theoretical Models Contrasting Evidence Study Discussion
  • 9. Grain-Size Effects in Reading (Pagliuca, Monaghan & McIntosh, 2008)  Findings seem to contradict those of Rastle & Coltheart, 1998  Indicates grain-size adapts according to statistics in the orthography - phonology mapping  Hypothesis: – If graphemes are functional units within the reading system, then a word containing a multi-letter grapheme should be read more accurately than a word without given the same kind of perceptual noise to impair the orthographic input.  Modelling data from single route model supported by behavioural study The Debate Theoretical Models Contrasting Evidence Study Discussion
  • 10. Modelling: Pagliuca, Monaghan & McIntosh, 2008. Single route model based on Harm & Seidenberg, 1999. Orthographic input represented by 8 letter slots. Activation from input letter slots reduced along a monotonic gradient from left to right, so that the lowest level of activation was in the leftmost slot – Two severities of impairment applied: severe and mild. Model tested on two sets of 62 words, all 5 letters in length and monosyllabic – Set 1: Digraphs in initial position: ch, sh, th – Set 2: Control set (no digraphs): cr, st, tr. Words beginning with digraphs were read more accurately. Figure from Exploring Grain-Size Effects in Reading. By Pagliuca, G., Monaghan, P., 2008, Proc 30th Ann Conf Cog Sci Soc. Mahwah, NJ: Lawrence Erlbaum.
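The left-to-right impairment described on this slide can be sketched as a vector of per-slot multipliers. This is only an illustration: the slide states that the gradient is monotonic and that the leftmost slot has the lowest activation, but the linear shape and the specific `leftmost` values used for the severe and mild conditions below are assumptions, not figures from the study.

```python
def gradient_activation(n_slots=8, leftmost=0.2, rightmost=1.0):
    """Return one activation multiplier per letter slot, rising linearly
    from `leftmost` (slot 0) to `rightmost` (last slot).

    The linear shape and the default values are assumptions for illustration.
    """
    if n_slots == 1:
        return [rightmost]
    step = (rightmost - leftmost) / (n_slots - 1)
    return [leftmost + i * step for i in range(n_slots)]


def impair(slot_activations, multipliers):
    """Scale each slot's activation by its multiplier."""
    return [a * m for a, m in zip(slot_activations, multipliers)]


# Hypothetical severities: a lower leftmost multiplier means a heavier impairment.
SEVERE = gradient_activation(leftmost=0.2)
MILD = gradient_activation(leftmost=0.6)

# A fully active 8-slot input, impaired under the severe gradient.
impaired_input = impair([1.0] * 8, SEVERE)
```

Under this scheme the first two slots, where the ch/sh/th and cr/st/tr onsets sit, receive the weakest input, which is where the digraph and control sets differ.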
  • 11. Behavioural Study: Pagliuca, Monaghan & McIntosh, 2008. Same sets of words used in the behavioural study as in the simulation. 84 additional filler words selected, each five letters long, with initial bigrams and initial letters that differed from those of the experimental and control stimuli. Visual noise applied to stimuli from left to right, similar to the noise applied in the simulation study. Participants completed a naming task in which each word was presented for 250 ms. 15 university students participated, all native English speakers. Words with digraphs were reported more accurately than words without, confirming the predictions made by the model. Figure from Exploring Grain-Size Effects in Reading. By Pagliuca, G., Monaghan, P., 2008, Proc 30th Ann Conf Cog Sci Soc. Mahwah, NJ: Lawrence Erlbaum.
  • 12. Conclusions: Pagliuca, Monaghan & McIntosh, 2008. Modelling: – For digraphs, two letter positions contribute to the activation of a single phoneme, whereas for non-digraphs each letter contributes to only one phoneme’s activation – Graphemes emerge in the course of a system learning the regularities between orthographic and phonological representations of words. Behavioural study: – Indicates that computational properties have a profound effect on reading, at least under conditions where visual input is impaired. The different computational properties of the mapping between letters and phonemes suggest that psycholinguistic effects of words should vary according to the compositionality of the mapping.
  • 13. Research Aims: 1. Using a computational model of reading based on Harm & Seidenberg, 1999, can we extend the digraph effects found in Pagliuca, Monaghan & McIntosh to non-words? 2. Test predictions raised by the model in experimental studies.
  • 14. Modelling Study: Design (Model). Computational Model: – Based on Harm & Seidenberg, 1999 – Orthographic Input Layer: • 10 letter slots • One of 26 units active in each slot to represent letter – Hidden Layer: 100 units – Phonological Output Layer: • 8 phoneme slots • Each phoneme represented in terms of 25 phonological features – 25 Clean-up units. Figure from Phonology, Reading Acquisition, and Dyslexia. By Harm, M.W., Seidenberg, M.S., 1999, Psychol Rev, 106, 491-528.
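The layer sizes on this slide imply 10 × 26 = 260 orthographic input units and 8 × 25 = 200 phonological output units. A minimal sketch of the slot-based input coding follows. The slide does not say how words are aligned to slots, so the left-aligned placement here is an assumption, and `encode_word` is a hypothetical helper name.

```python
import string

N_SLOTS = 10                        # letter slots, as stated on the slide
ALPHABET = string.ascii_lowercase   # 26 units per slot


def encode_word(word):
    """Encode a word as a 260-unit vector with one active unit per filled slot.

    Left-aligned slot assignment is an assumption; the original model may
    align letters differently.
    """
    if len(word) > N_SLOTS:
        raise ValueError("word is longer than the number of letter slots")
    vector = [0.0] * (N_SLOTS * len(ALPHABET))
    for slot, letter in enumerate(word.lower()):
        vector[slot * len(ALPHABET) + ALPHABET.index(letter)] = 1.0
    return vector


chalk = encode_word("chalk")
```

For a five-letter stimulus, five of the 260 units are active and the remaining slots stay empty.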
  • 15. Modelling Study: Design (Training) – Training corpus: • 6229 monosyllabic words • Words 1 to 8 letters in length – Training algorithm: • Backpropagation learning algorithm (Rumelhart, Hinton & Williams, 1986) – 5 million cycles of training, words sampled randomly according to frequency – 99.9% accuracy following training (tested on training corpus)
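Presenting words "randomly according to frequency" can be sketched as weighted sampling with replacement. The toy corpus and its counts below are hypothetical, and whether the study used raw or compressed frequencies is not stated on the slide; raw weights are assumed here.

```python
import random


def sample_training_words(frequencies, n_samples, seed=0):
    """Draw training words with probability proportional to their frequency.

    `frequencies` maps word -> count. Using raw (uncompressed) frequency as
    the sampling weight is an assumption.
    """
    rng = random.Random(seed)
    words = list(frequencies)
    weights = [frequencies[w] for w in words]
    return rng.choices(words, weights=weights, k=n_samples)


# Hypothetical three-word corpus, for illustration only.
toy_corpus = {"the": 1000, "shark": 10, "chalk": 5}
presentations = sample_training_words(toy_corpus, n_samples=10000)
```

With raw weights, very frequent words dominate training. Models of this kind often compress frequency for that reason, which would amount to transforming `weights` before sampling.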
  • 16. Modelling Study: Design (Stimuli) – Stimuli sets each containing 64 items: • Words with digraph in onset • Non-words with digraph in onset • Control Words • Control Non-words – All words and non-words 5 letters in length and monosyllabic – Onset pairings matched for same initial letter and similar bigram frequency – Controls applied: • Word frequency • Body Friends and Body Enemies • Neighbours • Unigram and Bigram frequency • Partial View Predictability – Non-words were formed by switching onsets and rimes within a given word set (controls were performed on non-words following formation) – Noise applied in three conditions: • No Noise • Uniform 50% reduction in input activation • Decreasing noise condition (replication of Pagliuca, Monaghan & McIntosh, 2008)
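The non-word construction described above (switching onsets and rimes within a word set) can be sketched as follows. The split rule (everything before the first vowel letter is the onset), the rotation used to pair onsets with rimes, and the toy word list are all assumptions made for illustration. The slide does not specify how pairings were chosen.

```python
VOWELS = set("aeiou")  # simplification: 'y' is not treated as a vowel


def split_onset_rime(word):
    """Split a monosyllabic word at its first vowel letter."""
    for i, letter in enumerate(word):
        if letter in VOWELS:
            return word[:i], word[i:]
    return word, ""  # no vowel letter found


def make_nonwords(words, lexicon):
    """Pair each onset with the rime of the next word in the set (wrapping
    around) and keep only candidates that are not real words."""
    parts = [split_onset_rime(w) for w in words]
    nonwords = []
    for i, (onset, _) in enumerate(parts):
        _, rime = parts[(i + 1) % len(parts)]
        candidate = onset + rime
        if candidate not in lexicon:
            nonwords.append(candidate)
    return nonwords


# Hypothetical digraph-onset words, for illustration only.
digraph_words = ["chalk", "shift", "thump"]
digraph_nonwords = make_nonwords(digraph_words, lexicon=set(digraph_words))
```

Because onsets and rimes are recombined within a set, the non-words keep the same onsets as their word counterparts. A real stimulus list would also need the frequency, neighbourhood and predictability controls listed on the slide.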
  • 17. Modelling Study: Results (Words). Model performance on word sets: – Both sets read with 100% accuracy before noise applied – Digraph set read with greater accuracy when input uniformly impaired (t(126) = 2.453, p < 0.01) – Digraph set read with greater accuracy in decreasing noise condition (t(126) = 4.396, p < 0.01). Figure key: ** p < 0.01, * p < 0.05.
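The 126 degrees of freedom reported here are consistent with an independent-samples t-test over two sets of 64 items (64 + 64 − 2 = 126). That reading is an inference from the numbers, not something the slides state. A minimal sketch of such a test, using made-up per-item scores rather than the study's data:

```python
import math
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Student's t for two independent samples with pooled variance.

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical per-item scores (1 = read correctly, 0 = error), 64 items per set.
digraph_scores = [1] * 48 + [0] * 16
control_scores = [1] * 32 + [0] * 32
t_value, df = independent_t(digraph_scores, control_scores)
```

A positive `t_value` here indicates higher mean accuracy for the first sample. The p-value would then be read from a t distribution with `df` degrees of freedom.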
  • 18. Modelling Study: Results (Non-words). Model performance on non-word sets: – Accuracy based on comparing output to target. Target formed by combining the phonetic representations of onset and rime extracted from the corpus – Lower accuracy in reproduction of digraph set before noise applied – Digraphs read more accurately in non-words when input uniformly impaired (t(126) = 3.355, p < 0.01) – Non-words containing digraphs read more accurately in decreasing noise condition (t(126) = 2.495, p < 0.01). Figure key: ** p < 0.01, * p < 0.05, ++ p < 0.01 based on error in onset.
  • 19. Model Predictions: Both words and non-words containing digraphs in the initial position will be identified with greater accuracy than controls. – For digraphs in both words and non-words, two letter positions contribute to the activation of a single phoneme
  • 20. Behavioural Study: Design (Stimuli) – 4 stimuli sets taken from simulation: • Control Non-words • Control Words • Words with digraphs in onset • Non-words with digraphs in onset – 2-dimensional digital pixel noise applied across word in decreasing gradient from left to right. Figures: example of control word with visual noise applied; example of control non-word with visual noise applied.
  • 21. Behavioural Study: Design (Procedure). Participants: – 15 university students – All native English speakers. Lexical decision task: – Departs from the naming task used in Pagliuca, Monaghan & McIntosh, 2008. Procedure: – Short practice period – Fixation cross presented before stimuli – Stimuli selected at random without replacement – Stimuli presented for 250 ms – Response recorded by key press – 256 trials completed by each participant
  • 22. Behavioural Study: Results (Accuracy). Accuracy of Response: Words containing digraphs were responded to more accurately than controls (t(14) = 3.254, p < 0.01). Non-words containing digraphs were responded to less accurately than controls (t(14) = 2.457, p < 0.05). Figure key: ** p < 0.01, * p < 0.05.
  • 23. Behavioural Study: Results (Response Times). Response Times: Similar trends were found in participants’ reaction times, although they did not reach significance.
  • 24. Summary. Modelling: – Greater accuracy reading both words and non-words containing digraphs in the initial position when noise is applied – For digraphs, two letter positions contribute to the activation of a single phoneme. Behavioural study: – Words containing digraphs identified with greater accuracy than controls when visual noise applied in a decreasing gradient across the word – Non-words containing digraphs identified with less accuracy than controls when visual noise applied in a decreasing gradient across the word
  • 25. Discussion (1). Task differences: – Word naming task: • Pagliuca, Monaghan & McIntosh, 2008 • Rastle and Coltheart, 1998 • Modelling study – Lexical decision task: • Behavioural study
  • 26. Discussion (2). Simulation and behavioural data showed an advantage for words containing digraphs: – Replication of Pagliuca, Monaghan & McIntosh, 2008 – Indicates the grain size for reading in English is adaptable according to statistics of the letter-sound mapping – Challenges views on independence of letter recognition (Pelli, Farell and Moore, 2003), indicating that word perception is affected by statistics in the language
  • 27. Discussion (3). Combined findings: – Single Route (Parallel Processing) Model: • Provides an explanation for the increased accuracy in identifying digraph words displayed by simulation and behavioural data (Pagliuca, Monaghan & McIntosh, 2008) • Model predicted an advantage for reading digraph non-words; however, the behavioural data showed lower accuracy of response and a non-significant trend towards slower reaction times
  • 28. Discussion (4). Combined findings: – Dual Route (Serial Processing) Model: • Provides an explanation for the reduced accuracy in digraph non-word response (Rastle & Coltheart, 1998) • Digraph word advantage not predicted by the model’s lexical route
  • 29. Direction of Future Study. Non-word naming task. Digraphs in final position: – If the non-lexical route is serial, this should lead to slower response times (Rastle & Coltheart, 1998). Use a similar paradigm to investigate grain-size effects in languages with differing grain sizes.
  • 30. Special Thanks & Acknowledgements Experimental Psychology Society Lancaster University