PhD Poster

Transcript

  • 1. Capturing Context for Spoken Corpus Analysis
Ronald Carter, Svenja Adolphs and Dawn Knight, The University of Nottingham. Research funded by ESRC Research Grant RES-149-25-1067.

The Digital Records for eSocial Science Project (DReSS II) seeks to support the collection and collation of a wider range of heterogeneous datasets for linguistic research, with the aim of facilitating the investigation of the interface between multiple modes of communication in everyday life. Records of everyday communication, from SMS and MMS messages, interaction in virtual environments (instant messaging, entries on personal notice boards etc.), GPS data, face-to-face situated discourse, phone calls and video calls, provide the 'data' for this research. It is anticipated that the systematic analysis of such 'ubiquitous' corpora will enable a more detailed investigation of the interface between a variety of different communicative modes from an individual's perspective, tracking a specific person's (inter)actions over time (i.e. across an hour, a day or even a week). Crudely speaking, the key global domains of such data types are as shown in figure 1.

Figure 1: Data 'domains'.

We aim to enable the mining/searching of corpora from a micro level, i.e. according to a specific word, phrase, tag or code, through to a more global level, i.e. according to a particular type of media used when recording, a particular physical location, and so on. It would also be beneficial to be able to search within and across these levels within DRS (the Digital Replay System), software that is being developed as part of this project. DRS is therefore currently being adapted to support the collection, storage, representation and interrogation of a wider range of heterogeneous datasets for linguistic research (see the mock-up in figure 3). Corpora of this nature will enable a more detailed investigation of the interface between various communicative modes across these different 'global domains'.

For ease of use (i.e. in order to add 'structure' to the data/metadata searches), DRS will enable us to perform searches of the corpora across all 'roots' and 'nodes', that is, across particular domains or sub-categories of domains. It will, for example, search for 'all participants of age 26', 'all participants called Dawn' or 'all participants of age 26 called Dawn', as seen in figure 2 (a schematic sketch of this kind of combinable metadata search is given after the transcript). These roots and nodes should preferably be user-definable, so that they can be added and omitted as required, allowing maximum usability. It should also be possible to block and unblock ties in order to compare outputs against each other.

Figure 2: Organising and searching (meta)data.
Figure 3: Heterogeneous datasets for multi-modal corpora, a DRS-based mock-up for representation.
  • 2. A Multi-modal Corpus Approach to the Analysis of Backchanneling Behaviour
Dawn Knight, The University of Nottingham. Research funded by ESRC Research Grants RES-149-25-0035 & RES-149-25-1067.

This ESRC-funded PhD research project sought to outline, and then utilise, a novel multi-modal approach to corpus linguistics. It specifically examined how such an approach can be used to facilitate our explorations of backchanneling phenomena in conversation, such as gestural (i.e. head nods) and verbal signals of active listenership. Although backchannels have been seen as highly conventionalised, they differ considerably in form, function, interlocutor and location (in context and co-text); their relevance at any given time in a given conversation is therefore highly conditional.

This study sought to analyse the patterned use of specific forms and functions of backchannels within and across sentence boundaries, as evidenced in a five-hour sub-corpus of dyadic multi-modal conversational episodes taken from the Nottingham Multi-Modal Corpus, as depicted in figure 1 (constructed as part of the Digital Records for eSocial Science Project). This was conducted using the multi-modal concordance tool, a feature of the DRS (Digital Replay System) workbench (figure 2); a schematic sketch of the kind of time-based gesture/speech alignment such a tool depends on is given after the transcript.

Figure 1: Dyadic data in the NMMC.
Figure 2: The DRS concordancer.

The results suggested that a close relationship exists between the function of a nod movement and its use in relation to spoken forms: that is, on whether or not it co-occurs with spoken backchannels, and on the particular form etc. of this lexical unit. Although these patterns are not necessarily counter-intuitive, it was not possible to support such claims when using traditional mono-modal corpora. So while the previous literature has, in passing, made reference to such patterns, they have never been investigated as extensively as in the current study.

These functions exist in the form of a cline, in effect from the most minimal to the more engaged forms of non-verbal backchannels (akin to Information Receipt Tokens and Engaged Response Tokens; IR and ER in figure 3). While, for example, nods used in isolation are generally the most minimal, least imposing forms of backchannel, this is perhaps more true for nods low in intensity (akin to Continuers and Convergence Tokens; CON and CNV in figure 3). More intense forms may act more emphatically, and so function in a more engaged way insofar as they are likely to be more noticeable to the speaker and may provide feedback, rather than merely maintaining the flow of talk.

Figure 3: A novel coding matrix for spoken and non-verbal backchannels in discourse.
  • 3. The Construction and Use of Multi-Modal Linguistic Corpora
Dawn Knight, The University of Nottingham. Research funded by ESRC Research Grants RES-149-25-0035 & RES-149-25-1067.

PhD Research

This ESRC-funded PhD research project sought to outline, and then utilise, a novel multi-modal approach to corpus linguistics. It specifically examined how such an approach can be used to facilitate our explorations of backchanneling phenomena in conversation, such as gestural (i.e. head nods) and verbal signals of active listenership. Although backchannels have been seen as highly conventionalised, they differ considerably in form, function, interlocutor and location (in context and co-text); their relevance at any given time in a given conversation is therefore highly conditional.

This study sought to analyse the patterned use of specific forms and functions of backchannels within and across sentence boundaries, as evidenced in a five-hour sub-corpus of dyadic conversation taken from the Nottingham Multi-Modal Corpus (constructed as part of the Digital Records for eSocial Science Project). This was undertaken using the multi-modal concordance tool, a feature of the DRS (Digital Replay System) (figure 1).

Figure 1: The DRS concordancer.

The results suggested that a close relationship exists between the function of a nod movement and its use in relation to spoken forms: that is, on whether or not it co-occurs with spoken backchannels, and on the particular form etc. of this lexical unit. Although these patterns are not necessarily counter-intuitive, it was not possible to support such claims when using traditional mono-modal corpora. So while the previous literature has, in passing, made reference to such patterns, they have never been investigated as extensively as in the current study.

Work as a Research Fellow

The Digital Records for eSocial Science Project (DReSS II) seeks to support the collection and collation of a wider range of heterogeneous datasets for linguistic research, with the aim of facilitating the investigation of the interface between multiple modes of communication in everyday life.

Records of everyday communication, from SMS and MMS messages, interaction in virtual environments (instant messaging, entries on personal notice boards etc.), GPS data, face-to-face situated discourse, phone calls and video calls, provide the 'data' for this research. It is anticipated that the systematic analysis of such 'ubiquitous' corpora will enable a more detailed investigation of the interface between a variety of different communicative modes from an individual's perspective, tracking a specific person's (inter)actions over time (i.e. across an hour, a day or even a week).

We aim to enable the mining/searching of corpora from a micro level, i.e. according to a specific word, phrase, tag or code, through to a more global level, i.e. according to a particular type of media used when recording, a particular physical location, and so on. It would also be beneficial to be able to search within and across these levels, something that DRS is currently being adapted to enable users to do (see the mock-up in figure 2).

Figure 2: Heterogeneous datasets for multi-modal corpora, a DRS-based mock-up for representation.
  • 4. The Construction and Use of Multi-Modal Linguistic Corpora
Dawn Knight, The University of Nottingham. Research funded by ESRC Research Grants RES-149-25-0035 & RES-149-25-1067. (Content identical to slide 3.)
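The combinable metadata searches described on the first poster (e.g. 'all participants of age 26 called Dawn') can be pictured with a short sketch. The Python below is a hypothetical illustration only: the names Recording and search, the metadata fields and the example records are assumptions made for the sake of the example, not part of DRS or of the corpus itself.

from dataclasses import dataclass, field

@dataclass
class Recording:
    # One corpus record with free-form metadata (participant, media type, location, ...).
    source_id: str
    metadata: dict = field(default_factory=dict)

def search(recordings, **criteria):
    # Return the recordings whose metadata matches every supplied criterion.
    # Criteria combine freely, mirroring searches such as 'all participants of
    # age 26' or 'all participants of age 26 called Dawn'.
    return [r for r in recordings
            if all(r.metadata.get(key) == value for key, value in criteria.items())]

# Illustrative records only; not taken from the Nottingham Multi-Modal Corpus.
corpus = [
    Recording("video_001", {"participant": "Dawn", "age": 26, "media": "video"}),
    Recording("sms_042", {"participant": "Dawn", "age": 26, "media": "SMS"}),
    Recording("audio_007", {"participant": "Ron", "age": 58, "media": "audio"}),
]

print([r.source_id for r in search(corpus, participant="Dawn", age=26)])
# -> ['video_001', 'sms_042']

User-defined 'roots' and 'nodes' would correspond here to the choice of metadata keys, which can be added or omitted as required.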

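The multi-modal concordancing described on the second poster rests on aligning gesture annotations with speech annotations in time. The sketch below shows one simple way such an alignment could be expressed; the annotation tuples, the 0.5-second co-occurrence window and the output wording are illustrative assumptions, not the DRS implementation or figures from the study.

# Each annotation is a (start_seconds, end_seconds, label) tuple on its own track.
nods = [(2.1, 2.6, "low-intensity nod"), (7.8, 8.4, "intense nod")]
spoken_backchannels = [(2.3, 2.5, "mmm"), (15.0, 15.3, "yeah")]

WINDOW = 0.5  # assumed co-occurrence tolerance in seconds

def co_occurs(a, b, window=WINDOW):
    # True if the two time spans overlap, or fall within `window` seconds of each other.
    return a[0] - window <= b[1] and b[0] - window <= a[1]

for nod in nods:
    matches = [bc[2] for bc in spoken_backchannels if co_occurs(nod, bc)]
    if matches:
        print(f"{nod[2]} at {nod[0]}s co-occurs with spoken {matches}")
    else:
        print(f"{nod[2]} at {nod[0]}s stands alone (no spoken backchannel)")

Run as written, the first nod is reported as co-occurring with the spoken "mmm", while the second stands alone; distinguishing these two cases is the kind of pattern the coding matrix in figure 3 of the second poster is concerned with.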