*
    Nick Campbell
    Speech Communication Lab
    Trinity College Dublin, Ireland
*
    * TCD – Stokes Professor (Dublin)
    * CNGL – PI – Delivery & Interaction
    * ELRA – board member / VP – speech
    * ISCA – board member – workshops
    * IEEE – Signal Processing Society – SLTC member
    * ATR/NiCT – research director (Japan)
    * Speech Prosody 2014 (Dublin) host

    * Speech scientist/researcher/corpus analyst
* AT&T Bell Labs
    * The ideas people – think ‘BIG’

* IBM UK Scientific Centre
    * The corpus people – ‘collect it all’

* ATR basic telecom research
    * The fundamentals – learn how to ‘infer’ from it


*
* we used to be considered BIG – speech data
  (and now multimedia) gobbled up memory
* I collected 1500 hours of everyday chat/daily
  conversations in 2000 (@ 1 GB per minute) –
  took 5 years to process!

* now Apple, Google, Microsoft, ... get that much each minute
       (but the secret is in the metadata)

* we need accessible data & tools for everybody!

   *
* but we need to manage privacy issues first!




  *
* and we need a way to protect IP as well

* written publications have ISBN standard
* work is now underway (cf ELRA & COCOSDA) to
  institute ISLRN for Language Resources
* researchers need to get credit for corpora as
  well as for publishing research results
* The community needs a way to identify,
  acknowledge, attribute, and reference data



 *
* tools for processing speech & multimodal data

* HTK, HTS, R, etc. ... not simple to use


* little consensus on what features to encode

* manual bootstrap – much too time-consuming!


*
* social interaction

* personal idiosyncrasies

* group dynamics – multimodal data (TB/hr)

* issues of robustness / domain specificity /
 privacy / storage & archiving / redistribution


     *
context analytics:


* cultural and language-specific needs
* multimodal – multimedia – multilingual
* tools for ‘less-well-supported’ languages

* e.g., U-STAR consortium for speech research –
 sharing tools & data & knowledge for research



     *
* European Language Resources Association
* COCOSDA – int’l coordinating committee
* IEEE SLTC, ISCA SIGs – there are places to go

    * but are they ready for really BIG data?
               perhaps not yet . . .




                          *
* curricula prepare people

* what standards to rely on?
* what resources available?
* what features to extract?
* what tools to work with?
* what use to put it to?
* what info to hide?
* what to do next?

                               *
*

Speech Technology and Big Data
