Introduction to CHILDES and TalkBank

Brian MacWhinney

CMU - Psychology, Modern Languages, Language Technologies Institute
The goal of TalkBank
The core idea
Human communication is a single unified process.
However, patterns in communication are analyzed by 20 different fields.
The time scales of these processes vary from milliseconds to centuries.
But all of these processes must have their ultimate effect in the Moment.
We can capture the Moment on video.
Principles
Data-sharing, Informed Consent
Multimedia
Open Access, Web Access,
Commentary
Specified Format
Interoperability
Community integration
Availability
http://childes.psy.cmu.edu

http://talkbank.org

programs, manuals, fonts, morphologies, CA conventions, video production guides, XML Schema, links to other programs

data can be either downloaded or played back over the web
Current target areas
1.   CHILDES

2.   PhonBank

3.   BilingualBank

4.   AphasiaBank

5.   CABank

6.   ClassBank
CHILDES
Child Language Data Exchange System

Founded in 1984 in Concord MA

Director: Brian MacWhinney macw@cmu.edu

Programmers: Leonid Spektor, Franklin Chen

3000 Members

130 corpora

Over 3200 published articles
CHILDES and TalkBank
               CHILDES      TalkBank
Age            23 years     7 years
Words          44 million   8 + 55 million
Media          750 GB       450 GB
Languages      32           18
Publications   3200+        89
Users          3000+        500
Practical Considerations
 Learning CLAN takes about a week

 Transcription is slow. Perhaps 15:1 ratio.
 Blitzscribe, LENA, etc. probably will not work

 Currently available data may not be perfect for a given issue

 Corpora may need enhancement through MOR or Coder’s editor
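
A rough worked example of the 15:1 figure, assuming it means about 15 hours of transcription for each hour of recording. The function name and its default are illustrative only, not part of CLAN:

```python
def transcription_hours(recording_hours: float, ratio: float = 15.0) -> float:
    """Estimate transcription effort as recording time multiplied by the ratio."""
    if recording_hours < 0 or ratio <= 0:
        raise ValueError("recording_hours must be >= 0 and ratio must be > 0")
    return recording_hours * ratio


if __name__ == "__main__":
    # Two hours of recording at the assumed 15:1 ratio.
    print(transcription_hours(2.0))
```

Under that reading, a two-hour session would take roughly 30 hours to transcribe.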




Tools from the Web
Data: childes.psy.cmu.edu/data

CLAN: childes.psy.cmu.edu/clan

Manuals: childes.psy.cmu.edu/manuals

Morphosyntax: childes.psy.cmu.edu/morgrams

Phon: childes.psy.cmu.edu/phon

Tutorial videos: talkbank.org/training

Digital video: talkbank.org/dv

CA Methods: talkbank.org/CABank
Why no handout?
“Overviews” link has this PPT presentation

CHILDES is now fully electronic. No more paper.




Available Methods
Microanalysis - CA, phonetics, ethology

Microgenetic analysis - CA, code-switching (NEXT)

Group and treatment comparisons - Genesee

Error analysis - YipMatthews

Diffusion analysis - in preschools

Longitudinal studies - growth curves

Modeling - neural nets, dynamic systems, evolutionary models
CLAN Tools
Transcribing

Editing

Counts -- FREQ, KWAL

Analyses: MOR, GRASP, PHON

Interoperability -- ELAN, Praat, SFS, EXMARaLDA, CLAPI, PHON
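
To give a feel for what the counting tools do, here is a minimal Python sketch of a FREQ-style word count and a KWAL-style keyword search over a tiny made-up transcript. This is not CLAN's implementation: the sample text, the function names, and the simple whitespace tokenizing are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical miniature transcript; speaker tiers start with "*CODE:" and a tab.
SAMPLE = """@Begin
@Participants:\tCHI Target_Child, MOT Mother
*CHI:\tmore cookie .
*MOT:\tyou want more cookie ?
*CHI:\tmore juice .
@End"""


def speaker_words(chat_text, speaker):
    """Return lowercased word tokens from one speaker's main tier lines."""
    prefix = f"*{speaker}:"
    words = []
    for line in chat_text.splitlines():
        if line.startswith(prefix):
            for token in line[len(prefix):].split():
                # Skip pure punctuation such as utterance terminators.
                if any(ch.isalnum() for ch in token):
                    words.append(token.lower())
    return words


def freq(chat_text, speaker):
    """FREQ-style count: how often each word occurs for one speaker."""
    return Counter(speaker_words(chat_text, speaker))


def kwal(chat_text, keyword):
    """KWAL-style search: speaker lines that contain the keyword as a token."""
    hits = []
    for line in chat_text.splitlines():
        if line.startswith("*"):
            utterance = line.split("\t", 1)[-1]
            if keyword in utterance.lower().split():
                hits.append(line)
    return hits


if __name__ == "__main__":
    print(freq(SAMPLE, "CHI").most_common())
    for hit in kwal(SAMPLE, "cookie"):
        print(hit)
```

The real programs handle the full CHAT format (dependent tiers, special codes, morphology), which this sketch ignores.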
CA marks in Unicode
Transcripts linked to
Ground Rules
Ethical use, informed consent

Levels of permission

Respect for dignity of participants

Respect for contributors

Requirement to cite sources

Requirement to contribute data



Info-CHILDES and Membership
Info-childes@googlegroups.com

Archived at LinguistList

Info-CHIBolts for nuts and bolts

Membership list

IASCL Membership




Getting Set Up
Download CLAN from Programs link




Windows issues
You can work in c:\childes

But your administrator may have this locked, so you may need shortcuts.

Windows IPA is difficult.

Windows compression may produce .wmf




Downloading Manuals
CHAT, CLAN




Sonic Mode
Esc-0 to start

Highlight area

Shift-click to move edge

Have cursor on line in file

S to insert time marks

Triple click a
Transcribing
Open new window (Command-N)

Insert headers
   @Begin
   @Languages: en
   @Participants: CHI Target_Child, MOT Mother, FAT Father, ROS Brother
   @Date

F5 with space at each utterance

Go back and transcribe each bullet (c-click)

Adjust time marks using Esc-A
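
The header lines above can also be generated by a small script before transcription starts. The Python sketch below is only an illustration: the function name is hypothetical, and the tab after each colon and the DD-MMM-YYYY date form are assumptions about CHAT conventions that should be checked against the CHAT manual.

```python
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def chat_header(languages, participants, recorded):
    """Build the four header lines listed on the slide.

    participants is a list of (code, role) pairs, e.g. ("CHI", "Target_Child").
    The tab separator and the date format are assumptions, not confirmed here.
    """
    people = ", ".join(f"{code} {role}" for code, role in participants)
    day = f"{recorded.day:02d}-{MONTHS[recorded.month - 1]}-{recorded.year}"
    return "\n".join([
        "@Begin",
        f"@Languages:\t{languages}",
        f"@Participants:\t{people}",
        f"@Date:\t{day}",
    ])


if __name__ == "__main__":
    header = chat_header(
        "en",
        [("CHI", "Target_Child"), ("MOT", "Mother"),
         ("FAT", "Father"), ("ROS", "Brother")],
        date(2008, 1, 5),
    )
    print(header)
```

The output can be pasted into a new CLAN window, after which the utterances and bullets are added as described above.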
F5, locate sound, enter bullets




Or use SoundWalker




Or use the Video Editor




Conclusions
CHILDES and TalkBank provide solid tools for studying language learning and functioning

Data-sharing has led to major advances in the field

New approaches emphasize the use of multimedia analysis, computational linguistics, and speech technology



