LET THE COMPUTER
DO THE WORK
Karen Cariani & Casey Davis
WGBH | Boston, MA, USA
the situation
■ 68,000 digitized television and radio programs
■ incomplete, inaccurate metadata records
■ limited staff resources
■ we need to know what we have in the collection
■ we have a responsibility to users to provide access to the collection
■ continued growth of the collection (content and sparse metadata)
SPEECH-TO-TEXT
TRANSCRIPTION
GAME
AUDIO WAVEFORM
ANALYSIS
The State of Recorded Sound
Preservation in the United States: A
National Legacy at Risk in the Digital Age
(2010)
Suggested that if scholars and students do not use sound archives,
cultural heritage institutions will be less inclined to preserve them.
Archives and libraries must collaborate with patrons and scholars to
understand how recordings are and might be used.
Scholars need to know what kinds of analysis are possible in an age
of large, freely available collections and advanced computational
analysis.
State of the art
A vision
“ . . . the sound file would become . . . a text
for study, much like the visual document.
The acoustic experience of listening to the
poem would begin to compete with the
visual experience of reading the poem.”
Bernstein, Charles. Attack of the Difficult Poems: Essays and Inventions.
University of Chicago Press, 2011, 114.
http://www.hipstas.org
HiPSTAS team
• Tanya Clement, [PI] Assistant Professor, University of
Texas at Austin
• Loretta Auvil, [Co-PI] Senior Project Coordinator at the
Illinois Informatics Institute (I3) at the University of
Illinois at Urbana-Champaign
• David Tcheng, [Co-PI] Research Scientist at I3; ARLO
developer
• Tony Borries, Research Programmer working as a
consultant with I3; ARLO programmer
• David Enstrom, Biologist, University of Illinois at
Urbana-Champaign; consultant
Participants, HiPSTAS Institute, 2013-2014
• 8 librarians and archivists
• 9 humanities scholars
• 3 advanced graduate students in humanities and
information science
Participating collections
• poetry from PennSound at the University of
Pennsylvania, 30,000 audio files
• folklore at the Dolph Briscoe Center for American
History at UT Austin, 57 feet of tapes (reels and
audiocassettes)
• storytelling traditions at the Native American Projects
(NAP) at the American Philosophical Society in
Philadelphia, 50 tribes, 3,000 hours
• field recordings (200,000 recordings) at the American Folklife
Center, Library of Congress
• 30,000 hours of oral histories, StoryCorps
• Speeches in the Southern Christian Leadership
Conference recordings, Emory University
• 700 recordings in the Elliston Poetry Collection at the
University of Cincinnati
• 36 interviews in the Dust, Drought and Dreams Gone
Dry: Oklahoma Women and the Dust Bowl (WDB) oral
history project out of the Oklahoma State Libraries
OTHER COLLECTIONS OF INTEREST TO PARTICIPANTS
HIPSTAS: PRIMARY GOALS
To develop a virtual research environment in which users
can better access and analyze spoken word collections of
interest to humanists through:
1. an assessment of scholarly requirements for analyzing
sound
2. an assessment of technological infrastructures needed
to support discovery
3. preliminary tests that demonstrate the efficacy of using
such tools in humanities scholarship
4. a freely available, open-source, API-driven version for
general use
ARLO (Adaptive Recognition with
Layered Optimization)
Spectrogram axes: frequency (Hz, a unit of frequency) and time
Energy represented by a heat-based color scheme:
White – hottest, most intense
Yellow
Red
Green
Blue
Black – coolest, least intense
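To make the spectrogram idea concrete, here is a minimal sketch in Python of how frequency, time, and energy can be computed from audio samples and mapped to the color scale named on the slide. It is not ARLO's code, and ARLO's own feature extraction may work differently. The frame size, hop size, sample rate, test tone, and function names are all assumptions chosen for illustration.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the first N/2 frequency bins of a frame (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags

def spectrogram(samples, frame_size=64, hop=32):
    """Cut the signal into overlapping frames: one list of bin energies per time step."""
    return [dft_magnitudes(samples[start:start + frame_size])
            for start in range(0, len(samples) - frame_size + 1, hop)]

def heat_color(energy, max_energy):
    """Map an energy value to the slide's palette, coolest to hottest."""
    palette = ["black", "blue", "green", "red", "yellow", "white"]
    if max_energy <= 0:
        return palette[0]
    level = min(int(energy / max_energy * len(palette)), len(palette) - 1)
    return palette[level]

# Hypothetical input: a pure 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(tone, frame_size=FRAME_SIZE, hop=32)
first_frame = spec[0]
peak_bin = max(range(len(first_frame)), key=lambda k: first_frame[k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE  # bin index converted to Hz
peak_color = heat_color(first_frame[peak_bin], max(first_frame))
```

Each inner list is one moment in time, each position in it is a frequency band, and the color stands for how much energy the band holds at that moment.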
OpenMary
LaynorStein
Searching for Sound with
Sound
Supervised Classification
UNSUPERVISED CLASSIFICATION
Get results
Visualize Results
Blue = sung; green = spoken; red = instrumental
55 John Alan Lomax recordings 1926-1941
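As a rough illustration of supervised classification with the three labels shown above (sung, spoken, instrumental), the sketch below trains a nearest-centroid classifier on a few tagged examples and then labels a new clip. This is a stand-in technique chosen for brevity, not a description of how ARLO classifies audio. The two-number feature vectors and all function names are hypothetical.

```python
import math

def centroid(vectors):
    """Average a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def train(examples):
    """examples maps each label to the feature vectors a person tagged with it."""
    return {label: centroid(vectors) for label, vectors in examples.items()}

def classify(model, vector):
    """Return the label whose centroid is closest to the given vector."""
    return min(model, key=lambda label: math.dist(model[label], vector))

# Hypothetical hand-tagged training clips, two made-up features per clip.
examples = {
    "sung": [[0.9, 0.2], [0.8, 0.3]],
    "spoken": [[0.3, 0.7], [0.2, 0.8]],
    "instrumental": [[0.7, 0.9], [0.9, 0.9]],
}

model = train(examples)
new_clip_label = classify(model, [0.85, 0.25])
```

The tagged examples play the role of the sound you search with; the classifier then looks for other clips that resemble them.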
Takeaways:
■ What do scholars talk about when they talk about
sound?
• Language dynamics: tempo, pitch, tone/timbre,
volume, pace, laughter, silence, applause, moans,
screams, dialects, changing speakers, gender,
age, changing genres
• Environment: fan hums, car horns, chickens, train
whistles, bird calls, frogs mating
• Materiality: recording noises, needle drops,
feedback, the electronic grid, changing tracks
■ What do engineers talk about when they talk about
audio?
• Resolution: bit depth, bit rate, sample rate
• Signal processing: Fast Fourier Transform (FFT)
and filter banks
• Dynamics: Damping ratios, gain, frequencies,
spectra, energy, and pitch energy
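The resolution terms are linked by simple arithmetic: for uncompressed PCM audio, bit rate is sample rate times bit depth times channel count. The sketch below works this through in Python. The CD-style settings (44.1 kHz, 16-bit, stereo) are assumed for illustration and are not the formats of any collection named here; the 3,000-hour figure is borrowed from the collection list only to give a sense of scale.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    """Bits per second for uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels

def storage_bytes(hours, sample_rate_hz, bit_depth, channels):
    """Bytes needed to store the given number of hours of uncompressed PCM."""
    seconds = hours * 3600
    return pcm_bit_rate(sample_rate_hz, bit_depth, channels) * seconds // 8

# Assumed settings: 44,100 samples per second, 16 bits per sample, 2 channels.
rate = pcm_bit_rate(44_100, 16, 2)                  # bits per second
one_hour = storage_bytes(1, 44_100, 16, 2)          # bytes
three_thousand_hours = storage_bytes(3_000, 44_100, 16, 2)

print(f"{rate:,} bits/s")
print(f"{one_hour:,} bytes per hour")
print(f"{three_thousand_hours / 1e12:.2f} TB for 3,000 hours")
```

Under these assumptions one hour takes about 635 MB, so 3,000 hours would need roughly 1.9 TB before any compression.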
■ What do computer scientists talk about when
they talk about ML?
• Features: What are we measuring?
• Ground Truth: What’s the answer? How do we
know when we’re accurate?
• Optimization: Accuracy vs. Efficiency – how do
you balance the accuracy of your results
against the computational resources you need
to achieve that level of accuracy?
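One way to picture the ground-truth question is to compare a classifier's labels against labels a person assigned by listening. The Python sketch below computes accuracy, plus precision and recall for a single label. The label lists are invented for illustration and are not results from the project.

```python
def accuracy(truth, predicted):
    """Share of clips where the predicted label matches the human label."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must be the same length")
    correct = sum(t == p for t, p in zip(truth, predicted))
    return correct / len(truth)

def precision_recall(truth, predicted, label):
    """Precision and recall for one label, treating all other labels as 'not label'."""
    pairs = list(zip(truth, predicted))
    true_pos = sum(t == label and p == label for t, p in pairs)
    false_pos = sum(t != label and p == label for t, p in pairs)
    false_neg = sum(t == label and p != label for t, p in pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical labels: what a listener heard vs. what a classifier guessed.
truth = ["sung", "sung", "spoken", "spoken", "instrumental", "spoken"]
predicted = ["sung", "spoken", "spoken", "spoken", "instrumental", "sung"]

overall = accuracy(truth, predicted)
sung_precision, sung_recall = precision_recall(truth, predicted, "sung")
```

Here four of six clips match, so accuracy is about 0.67, while precision and recall for "sung" are both 0.5. Whether that is good enough depends on the question being asked of the collection.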
• Literacy: How much do we need to know about the
technology of audio, of computational methods, and of
humanist inquiry to do new kinds of research in this area?
• Usability: What kinds of interfaces and tools facilitate AV
analysis in a diverse range of disciplines and communities?
Who gets access to these tools and for what kinds of
questions?
• Accuracy: Is good enough, good enough?
• Scalability: How much storage and processing power do
users need to conduct local and large-scale AV analyses? A
Laptop? A Supercomputer?
• Sustainability: What are local, national, and global scale
issues? How does this work fit back into the access
infrastructure already in place in archives, libraries,
classrooms? Is data enough to get us over the hump of our
limited means for discovery?
NATURAL LANGUAGE
PROCESSING TOOLS
Computational tools
■ Language
■ Speech to text
■ Image recognition
■ Sound
Data visualization
■ ARLO
■ HiPSTAS
We will want to show sample files
■ Pop Up Archive
■ Speech to text
■ Games to correct
Visualizations
LAPPS Grid
■ Tools listed
americanarchive.org
@amarchivepub
facebook.com/amarchivepub

More Related Content

Similar to Let the Computer Do the Work

"It will discourse most eloquent music": Sonify Variants of Hamlet
"It will discourse most eloquent music": Sonify Variants of Hamlet"It will discourse most eloquent music": Sonify Variants of Hamlet
"It will discourse most eloquent music": Sonify Variants of Hamlet
Iain Emsley
 
Improving Access to Historic Public Broadcasting through Speech-to-Text, Crow...
Improving Access to Historic Public Broadcasting through Speech-to-Text, Crow...Improving Access to Historic Public Broadcasting through Speech-to-Text, Crow...
Improving Access to Historic Public Broadcasting through Speech-to-Text, Crow...
WGBH Media Library and Archives
 
Music Objects to Social Machines
Music Objects to Social MachinesMusic Objects to Social Machines
Music Objects to Social Machines
David De Roure
 
Sound Matters: a framework for the creative use and re-use of sound: field re...
Sound Matters: a framework for the creative use and re-use of sound: field re...Sound Matters: a framework for the creative use and re-use of sound: field re...
Sound Matters: a framework for the creative use and re-use of sound: field re...
Jisc
 
Academic Libraries In The 21st Century
Academic Libraries In The 21st CenturyAcademic Libraries In The 21st Century
Academic Libraries In The 21st Century
Wilma Jones
 
The convergence of "hard" and "soft"in music technology, Rolf Inge Godøy, UiO
The convergence of "hard" and "soft"in music technology, Rolf Inge Godøy, UiOThe convergence of "hard" and "soft"in music technology, Rolf Inge Godøy, UiO
The convergence of "hard" and "soft"in music technology, Rolf Inge Godøy, UiO
The Research Council of Norway, IKTPLUSS
 
Digital Humanities Venice Group Presentation - Opening the Libro d'Oro
Digital Humanities Venice Group Presentation - Opening the Libro d'OroDigital Humanities Venice Group Presentation - Opening the Libro d'Oro
Digital Humanities Venice Group Presentation - Opening the Libro d'OroMichael Mitchell
 
Linked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchLinked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities research
Enrico Daga
 
Estermann performing arts_database_20180721
Estermann performing arts_database_20180721Estermann performing arts_database_20180721
Estermann performing arts_database_20180721
Beat Estermann
 
Pratt SILS Cultural Heritage: Description and Access Spring 2011
Pratt SILS Cultural Heritage: Description and Access Spring 2011Pratt SILS Cultural Heritage: Description and Access Spring 2011
Pratt SILS Cultural Heritage: Description and Access Spring 2011PrattSILS
 
Digital Classicist London Seminars 2013 - Seminar 5 - Dot Porter
Digital Classicist London Seminars 2013 - Seminar 5 - Dot PorterDigital Classicist London Seminars 2013 - Seminar 5 - Dot Porter
Digital Classicist London Seminars 2013 - Seminar 5 - Dot Porter
DigitalClassicistLondon
 
Big Data Case Studies
Big Data Case Studies Big Data Case Studies
Big Data Case Studies
UIResearchPark
 
Boston Library Consortium Webinars: Use of AAPB in Humanities Research"
Boston Library Consortium Webinars: Use of AAPB in Humanities Research"Boston Library Consortium Webinars: Use of AAPB in Humanities Research"
Boston Library Consortium Webinars: Use of AAPB in Humanities Research"
Ryn Marchese
 
What's the Point Of Digitisation: Measuring Use and Impact
What's the Point Of Digitisation: Measuring Use and ImpactWhat's the Point Of Digitisation: Measuring Use and Impact
What's the Point Of Digitisation: Measuring Use and Impact
Alastair Dunning
 
Parker musiclib401
Parker musiclib401Parker musiclib401
Parker musiclib401japarker12
 
Library Futures & the Importance of Understanding Communities of Users
Library Futures & the Importance of Understanding Communities of UsersLibrary Futures & the Importance of Understanding Communities of Users
Library Futures & the Importance of Understanding Communities of Users
Christine Madsen
 
topics natural language processing and image processing
topics natural language processing and image processingtopics natural language processing and image processing
topics natural language processing and image processing
youkayaslam
 
way_topics.ppt
way_topics.pptway_topics.ppt
way_topics.ppt
UmayKulsoom2
 
2013 RBMS Premodern manuscript application profile presentation
2013 RBMS Premodern manuscript application profile presentation2013 RBMS Premodern manuscript application profile presentation
2013 RBMS Premodern manuscript application profile presentation
ssteuer
 

Similar to Let the Computer Do the Work (20)

"It will discourse most eloquent music": Sonify Variants of Hamlet
"It will discourse most eloquent music": Sonify Variants of Hamlet"It will discourse most eloquent music": Sonify Variants of Hamlet
"It will discourse most eloquent music": Sonify Variants of Hamlet
 
Improving Access to Historic Public Broadcasting through Speech-to-Text, Crow...
Improving Access to Historic Public Broadcasting through Speech-to-Text, Crow...Improving Access to Historic Public Broadcasting through Speech-to-Text, Crow...
Improving Access to Historic Public Broadcasting through Speech-to-Text, Crow...
 
Music Objects to Social Machines
Music Objects to Social MachinesMusic Objects to Social Machines
Music Objects to Social Machines
 
Sound Matters: a framework for the creative use and re-use of sound: field re...
Sound Matters: a framework for the creative use and re-use of sound: field re...Sound Matters: a framework for the creative use and re-use of sound: field re...
Sound Matters: a framework for the creative use and re-use of sound: field re...
 
Academic Libraries In The 21st Century
Academic Libraries In The 21st CenturyAcademic Libraries In The 21st Century
Academic Libraries In The 21st Century
 
The convergence of "hard" and "soft"in music technology, Rolf Inge Godøy, UiO
The convergence of "hard" and "soft"in music technology, Rolf Inge Godøy, UiOThe convergence of "hard" and "soft"in music technology, Rolf Inge Godøy, UiO
The convergence of "hard" and "soft"in music technology, Rolf Inge Godøy, UiO
 
Digital Humanities Venice Group Presentation - Opening the Libro d'Oro
Digital Humanities Venice Group Presentation - Opening the Libro d'OroDigital Humanities Venice Group Presentation - Opening the Libro d'Oro
Digital Humanities Venice Group Presentation - Opening the Libro d'Oro
 
Linked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities researchLinked data for knowledge curation in humanities research
Linked data for knowledge curation in humanities research
 
Estermann performing arts_database_20180721
Estermann performing arts_database_20180721Estermann performing arts_database_20180721
Estermann performing arts_database_20180721
 
Pratt SILS Cultural Heritage: Description and Access Spring 2011
Pratt SILS Cultural Heritage: Description and Access Spring 2011Pratt SILS Cultural Heritage: Description and Access Spring 2011
Pratt SILS Cultural Heritage: Description and Access Spring 2011
 
Digital Classicist London Seminars 2013 - Seminar 5 - Dot Porter
Digital Classicist London Seminars 2013 - Seminar 5 - Dot PorterDigital Classicist London Seminars 2013 - Seminar 5 - Dot Porter
Digital Classicist London Seminars 2013 - Seminar 5 - Dot Porter
 
Big Data Case Studies
Big Data Case Studies Big Data Case Studies
Big Data Case Studies
 
Boston Library Consortium Webinars: Use of AAPB in Humanities Research"
Boston Library Consortium Webinars: Use of AAPB in Humanities Research"Boston Library Consortium Webinars: Use of AAPB in Humanities Research"
Boston Library Consortium Webinars: Use of AAPB in Humanities Research"
 
What's the Point Of Digitisation: Measuring Use and Impact
What's the Point Of Digitisation: Measuring Use and ImpactWhat's the Point Of Digitisation: Measuring Use and Impact
What's the Point Of Digitisation: Measuring Use and Impact
 
Parker musiclib401
Parker musiclib401Parker musiclib401
Parker musiclib401
 
Library Futures & the Importance of Understanding Communities of Users
Library Futures & the Importance of Understanding Communities of UsersLibrary Futures & the Importance of Understanding Communities of Users
Library Futures & the Importance of Understanding Communities of Users
 
topics natural language processing and image processing
topics natural language processing and image processingtopics natural language processing and image processing
topics natural language processing and image processing
 
way_topics.ppt
way_topics.pptway_topics.ppt
way_topics.ppt
 
2013 RBMS Premodern manuscript application profile presentation
2013 RBMS Premodern manuscript application profile presentation2013 RBMS Premodern manuscript application profile presentation
2013 RBMS Premodern manuscript application profile presentation
 
69 kuta
69 kuta69 kuta
69 kuta
 

More from WGBH Media Library and Archives

Engage Your Community to Celebrate Your History
Engage Your Community to Celebrate Your HistoryEngage Your Community to Celebrate Your History
Engage Your Community to Celebrate Your History
WGBH Media Library and Archives
 
Wikipedia Editathon: How to Guide
Wikipedia Editathon: How to GuideWikipedia Editathon: How to Guide
Wikipedia Editathon: How to Guide
WGBH Media Library and Archives
 
FIX IT+ Transcript Editing
FIX IT+ Transcript EditingFIX IT+ Transcript Editing
FIX IT+ Transcript Editing
WGBH Media Library and Archives
 
Press Play on History: Unlocking 70 Years of Primary Source Materials for Dis...
Press Play on History: Unlocking 70 Years of Primary Source Materials for Dis...Press Play on History: Unlocking 70 Years of Primary Source Materials for Dis...
Press Play on History: Unlocking 70 Years of Primary Source Materials for Dis...
WGBH Media Library and Archives
 
AV Digitization Projects: Tools and Strategies for Enhancing Impact and Engag...
AV Digitization Projects: Tools and Strategies for Enhancing Impact and Engag...AV Digitization Projects: Tools and Strategies for Enhancing Impact and Engag...
AV Digitization Projects: Tools and Strategies for Enhancing Impact and Engag...
WGBH Media Library and Archives
 
Implementing Samvera Open Source Technology at WGBH and the American Archive ...
Implementing Samvera Open Source Technology at WGBH and the American Archive ...Implementing Samvera Open Source Technology at WGBH and the American Archive ...
Implementing Samvera Open Source Technology at WGBH and the American Archive ...
WGBH Media Library and Archives
 
Use of American Archive of Public Broadcasting in Humanities Research
Use of American Archive of Public Broadcasting in Humanities ResearchUse of American Archive of Public Broadcasting in Humanities Research
Use of American Archive of Public Broadcasting in Humanities Research
WGBH Media Library and Archives
 
American Archive of Public Broadcasting: a Digital Library for Teaching Media...
American Archive of Public Broadcasting: a Digital Library for Teaching Media...American Archive of Public Broadcasting: a Digital Library for Teaching Media...
American Archive of Public Broadcasting: a Digital Library for Teaching Media...
WGBH Media Library and Archives
 
Accessibility of the American Archive of Public Broadcasting in Academic Libr...
Accessibility of the American Archive of Public Broadcasting in Academic Libr...Accessibility of the American Archive of Public Broadcasting in Academic Libr...
Accessibility of the American Archive of Public Broadcasting in Academic Libr...
WGBH Media Library and Archives
 
How to Use the American Archive of Public Broadcasting as a Resource in the C...
How to Use the American Archive of Public Broadcasting as a Resource in the C...How to Use the American Archive of Public Broadcasting as a Resource in the C...
How to Use the American Archive of Public Broadcasting as a Resource in the C...
WGBH Media Library and Archives
 
Putting the Pieces Together: Creating a National Educational Television Catalog
Putting the Pieces Together: Creating a National Educational Television CatalogPutting the Pieces Together: Creating a National Educational Television Catalog
Putting the Pieces Together: Creating a National Educational Television Catalog
WGBH Media Library and Archives
 
DESIGN FOR CONTEXT: Cataloging, Web Design, and Linked Data for Exposing Nati...
DESIGN FOR CONTEXT: Cataloging, Web Design, and Linked Data for Exposing Nati...DESIGN FOR CONTEXT: Cataloging, Web Design, and Linked Data for Exposing Nati...
DESIGN FOR CONTEXT: Cataloging, Web Design, and Linked Data for Exposing Nati...
WGBH Media Library and Archives
 
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
WGBH Media Library and Archives
 
Preserving Your Station Legacy with the American Archive of Public Broadcasti...
Preserving Your Station Legacy with the American Archive of Public Broadcasti...Preserving Your Station Legacy with the American Archive of Public Broadcasti...
Preserving Your Station Legacy with the American Archive of Public Broadcasti...
WGBH Media Library and Archives
 
FIX IT - A Transcript Game to Make Historic Public Broadcasting More Discover...
FIX IT - A Transcript Game to Make Historic Public Broadcasting More Discover...FIX IT - A Transcript Game to Make Historic Public Broadcasting More Discover...
FIX IT - A Transcript Game to Make Historic Public Broadcasting More Discover...
WGBH Media Library and Archives
 
Using Computational Tools and Crowdsourcing Games to Increase Metadata and Di...
Using Computational Tools and Crowdsourcing Games to Increase Metadata and Di...Using Computational Tools and Crowdsourcing Games to Increase Metadata and Di...
Using Computational Tools and Crowdsourcing Games to Increase Metadata and Di...
WGBH Media Library and Archives
 
Can the Computer and the Public Do the Metadata Work?
Can the Computer and the Public Do the Metadata Work?Can the Computer and the Public Do the Metadata Work?
Can the Computer and the Public Do the Metadata Work?
WGBH Media Library and Archives
 
Going Far by Going Together: Collaboration with Scholars and Other Allies
Going Far by Going Together: Collaboration with Scholars and Other AlliesGoing Far by Going Together: Collaboration with Scholars and Other Allies
Going Far by Going Together: Collaboration with Scholars and Other Allies
WGBH Media Library and Archives
 
Building AAPB Participation into Digitization Grant Proposals: Requirements, ...
Building AAPB Participation into Digitization Grant Proposals: Requirements, ...Building AAPB Participation into Digitization Grant Proposals: Requirements, ...
Building AAPB Participation into Digitization Grant Proposals: Requirements, ...
WGBH Media Library and Archives
 
Building the AAPB: Inter-Institutional Preservation and Access Workflows
Building the AAPB: Inter-Institutional Preservation and Access WorkflowsBuilding the AAPB: Inter-Institutional Preservation and Access Workflows
Building the AAPB: Inter-Institutional Preservation and Access Workflows
WGBH Media Library and Archives
 

More from WGBH Media Library and Archives (20)

Engage Your Community to Celebrate Your History
Engage Your Community to Celebrate Your HistoryEngage Your Community to Celebrate Your History
Engage Your Community to Celebrate Your History
 
Wikipedia Editathon: How to Guide
Wikipedia Editathon: How to GuideWikipedia Editathon: How to Guide
Wikipedia Editathon: How to Guide
 
FIX IT+ Transcript Editing
FIX IT+ Transcript EditingFIX IT+ Transcript Editing
FIX IT+ Transcript Editing
 
Press Play on History: Unlocking 70 Years of Primary Source Materials for Dis...
Press Play on History: Unlocking 70 Years of Primary Source Materials for Dis...Press Play on History: Unlocking 70 Years of Primary Source Materials for Dis...
Press Play on History: Unlocking 70 Years of Primary Source Materials for Dis...
 
AV Digitization Projects: Tools and Strategies for Enhancing Impact and Engag...
AV Digitization Projects: Tools and Strategies for Enhancing Impact and Engag...AV Digitization Projects: Tools and Strategies for Enhancing Impact and Engag...
AV Digitization Projects: Tools and Strategies for Enhancing Impact and Engag...
 
Implementing Samvera Open Source Technology at WGBH and the American Archive ...
Implementing Samvera Open Source Technology at WGBH and the American Archive ...Implementing Samvera Open Source Technology at WGBH and the American Archive ...
Implementing Samvera Open Source Technology at WGBH and the American Archive ...
 
Use of American Archive of Public Broadcasting in Humanities Research
Use of American Archive of Public Broadcasting in Humanities ResearchUse of American Archive of Public Broadcasting in Humanities Research
Use of American Archive of Public Broadcasting in Humanities Research
 
American Archive of Public Broadcasting: a Digital Library for Teaching Media...
American Archive of Public Broadcasting: a Digital Library for Teaching Media...American Archive of Public Broadcasting: a Digital Library for Teaching Media...
American Archive of Public Broadcasting: a Digital Library for Teaching Media...
 
Accessibility of the American Archive of Public Broadcasting in Academic Libr...
Accessibility of the American Archive of Public Broadcasting in Academic Libr...Accessibility of the American Archive of Public Broadcasting in Academic Libr...
Accessibility of the American Archive of Public Broadcasting in Academic Libr...
 
How to Use the American Archive of Public Broadcasting as a Resource in the C...
How to Use the American Archive of Public Broadcasting as a Resource in the C...How to Use the American Archive of Public Broadcasting as a Resource in the C...
How to Use the American Archive of Public Broadcasting as a Resource in the C...
 
Putting the Pieces Together: Creating a National Educational Television Catalog
Putting the Pieces Together: Creating a National Educational Television CatalogPutting the Pieces Together: Creating a National Educational Television Catalog
Putting the Pieces Together: Creating a National Educational Television Catalog
 
DESIGN FOR CONTEXT: Cataloging, Web Design, and Linked Data for Exposing Nati...
DESIGN FOR CONTEXT: Cataloging, Web Design, and Linked Data for Exposing Nati...DESIGN FOR CONTEXT: Cataloging, Web Design, and Linked Data for Exposing Nati...
DESIGN FOR CONTEXT: Cataloging, Web Design, and Linked Data for Exposing Nati...
 
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
DESIGN FOR CONTEXT: Cataloging and Linked Data for Exposing National Educatio...
 
Preserving Your Station Legacy with the American Archive of Public Broadcasti...
Preserving Your Station Legacy with the American Archive of Public Broadcasti...Preserving Your Station Legacy with the American Archive of Public Broadcasti...
Preserving Your Station Legacy with the American Archive of Public Broadcasti...
 
FIX IT - A Transcript Game to Make Historic Public Broadcasting More Discover...
FIX IT - A Transcript Game to Make Historic Public Broadcasting More Discover...FIX IT - A Transcript Game to Make Historic Public Broadcasting More Discover...
FIX IT - A Transcript Game to Make Historic Public Broadcasting More Discover...
 
Using Computational Tools and Crowdsourcing Games to Increase Metadata and Di...
Using Computational Tools and Crowdsourcing Games to Increase Metadata and Di...Using Computational Tools and Crowdsourcing Games to Increase Metadata and Di...
Using Computational Tools and Crowdsourcing Games to Increase Metadata and Di...
 
Can the Computer and the Public Do the Metadata Work?
Can the Computer and the Public Do the Metadata Work?Can the Computer and the Public Do the Metadata Work?
Can the Computer and the Public Do the Metadata Work?
 
Going Far by Going Together: Collaboration with Scholars and Other Allies
Going Far by Going Together: Collaboration with Scholars and Other AlliesGoing Far by Going Together: Collaboration with Scholars and Other Allies
Going Far by Going Together: Collaboration with Scholars and Other Allies
 
Building AAPB Participation into Digitization Grant Proposals: Requirements, ...
Building AAPB Participation into Digitization Grant Proposals: Requirements, ...Building AAPB Participation into Digitization Grant Proposals: Requirements, ...
Building AAPB Participation into Digitization Grant Proposals: Requirements, ...
 
Building the AAPB: Inter-Institutional Preservation and Access Workflows
Building the AAPB: Inter-Institutional Preservation and Access WorkflowsBuilding the AAPB: Inter-Institutional Preservation and Access Workflows
Building the AAPB: Inter-Institutional Preservation and Access Workflows
 

Recently uploaded

Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Large Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial ApplicationsLarge Language Model (LLM) and it’s Geospatial Applications
Large Language Model (LLM) and it’s Geospatial Applications
Rohit Gautam
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
Let the Computer Do the Work

  • 1. LET THE COMPUTER DO THE WORK Karen Cariani & Casey Davis WGBH | Boston, MA, USA
  • 4. the situation ■ 68,000 digitized television and radio programs ■ incomplete, inaccurate metadata records ■ limited staff resources ■ we need to know what we have in the collection ■ we have a responsibility to users to provide access to the collection ■ continued growth of the collection (content and sparse metadata)
  • 9. The State of Recorded Sound Preservation in the United States: A National Legacy at Risk in the Digital Age (2010) Suggested that if scholars and students do not use sound archives, cultural heritage institutions will be less inclined to preserve them. Archives and libraries must collaborate with patrons and scholars to understand how recordings are and might be used. Scholars need to know what kinds of analysis are possible in an age of large, freely available collections and advanced computational analysis.
  • 11. A vision “ . . . the sound file would become . . . a text for study, much like the visual document. The acoustic experience of listening to the poem would begin to compete with the visual experience of reading the poem.” Bernstein, Charles. Attack of the Difficult Poems: Essays and Inventions. University of Chicago Press, 2011, 114.
  • 13. HiPSTAS team • Tanya Clement, [PI] Assistant Professor, University of Texas at Austin • Loretta Auvil [Co-PI] Senior Project Coordinator at the Illinois Informatics Institute (I3) at the University of Illinois at Urbana-Champaign • David Tcheng [Co-PI] Research Scientist at I3; ARLO developer • Tony Borries, Research Programmer working as a consultant with I3; ARLO programmer • David Enstrom, Biologist, University of Illinois at Urbana-Champaign; consultant
  • 14. Participants, HiPSTAS Institute, 2013-2014 • 8 librarians and archivists • 9 humanities scholars • 3 advanced graduate students in humanities and information science
  • 15. Participating collections • poetry from PennSound at the University of Pennsylvania, 30,000 audio files • folklore at the Dolph Briscoe Center for American History at UT Austin, 57 feet of tapes (reels and audiocassettes) • storytelling traditions at the Native American Projects (NAP) at the American Philosophical Society in Philadelphia, 50 tribes, 3,000 hours
  • 16. OTHER COLLECTIONS OF INTEREST TO PARTICIPANTS • Field recordings (200,000 recordings), American Folklife Center, Library of Congress • 30,000 hours, oral histories, StoryCorps • Speeches in the Southern Christian Leadership Conference recordings, Emory University • 700 recordings in the Elliston Poetry Collection at the University of Cincinnati • 36 interviews in the Dust, Drought and Dreams Gone Dry: Oklahoma Women and the Dust Bowl (WDB) oral history project out of the Oklahoma State Libraries
  • 17. HIPSTAS: PRIMARY GOALS To develop a virtual research environment in which users can better access and analyze spoken word collections of interest to humanists through: 1. an assessment of scholarly requirements for analyzing sound 2. an assessment of technological infrastructures needed to support discovery 3. preliminary tests that demonstrate the efficacy of using such tools in humanities scholarship 4. a freely available, open-source, API-driven version for general use
  • 18. ARLO (Adaptive Recognition with Layered Optimization) [Spectrogram: frequency in Hz, a unit of frequency, plotted against time. Energy is represented by a heat-based color scheme: white (hottest, most intense), yellow, red, green, blue, black (coolest, least intense).]
  • 19. Searching for Sound with Sound Supervised Classification [figure labels: OpenMary LaynorStein]
  • 25. Visualize results: 55 John Alan Lomax recordings, 1926-1941. Blue = sung; green = spoken; red = instrumental
  • 26. Visualize results: 55 John Alan Lomax recordings, 1926-1941
  • 27. Takeaways: ■ What do scholars talk about when they talk about sound? • Language dynamics: tempo, pitch, tone/timbre, volume, pace, laughter, silence, applause, moans, screams, dialects, changing speakers, gender, age, changing genres • Environment: fan hums, car horns, chickens, train whistles, bird calls, frogs mating • Materiality: recording noises, needle drops, feedback, the electronic grid, changing tracks
  • 28. Takeaways ■ What do engineers talk about when they talk about audio? • Resolution: bit depth, bit rate, sample rate • Signal processing: Fast Fourier Transform (FFT) and filter banks • Dynamics: damping ratios, gain, frequencies, spectra, energy, and pitch energy
  • 29. Takeaways ■ What do computer scientists talk about when they talk about ML? • Features: What are we measuring? • Ground Truth: What’s the answer? How do we know when we’re accurate? • Optimization: Accuracy vs. Efficiency – how do you balance the accuracy of your results against the computational resources you need to achieve that level of accuracy?
  • 30. Takeaways • Literacy: How much do we need to know about the technology of audio, of computational methods, and of humanist inquiry to do new kinds of research in this area? • Usability: What kinds of interfaces and tools facilitate AV analysis in a diverse range of disciplines and communities? Who gets access to these tools and for what kinds of questions? • Accuracy: Is good enough, good enough? • Scalability: How much storage and processing power do users need to conduct local and large-scale AV analyses? A Laptop? A Supercomputer? • Sustainability: What are local, national, and global scale issues? How does this work fit back into the access infrastructure already in place in archives, libraries, classrooms? Is data enough to get us over the hump of our limited means for discovery?
  • 32. Computational tools ■ Language ■ Speech to text ■ Image recognition ■ Sound
  • 34. We will want to show sample files ■ Pop Up Archive ■ Speech to text
  • 35. ■ Games to correct

Editor's Notes

  1. Machine learning was performed with an instance-based algorithm; a distance weighting power of 15 and a classification threshold of 0.4 proved optimal. The configuration was found using inductive biased optimization: all machine learning algorithms have control parameters (here, the classification threshold and the distance weighting power), and you need to try out different values to find what is optimal for your problem. File-based cross-validation was used to measure predictive performance: simulate having the ground truth only from other files and see how well you can predict the held-out file. The model was then applied to all the files (25 million examples). An example is a 1/32-second time slice, and every example in a file is classified. To classify a time slice, compute the distance between it and every known slice. Each slice is described by 256 bands, so the feature space has 256 dimensions and the distance is measured between two points in that space: for each dimension, take the absolute value of the difference between the two feature values, then sum all the differences using a power of 1 (city-block or taxicab distance, measured along straight axis-parallel lines; Euclidean distance, "as the crow flies", uses a power of 2). See http://taxicabgeometry.net/general/basics.html. After computing the distance, convert it into a weight: the weight for every example is 1/distance raised to a power, which in our case is 15.
Predictions from neighboring windows, previous or subsequent, are not used to determine classes. The classes and weights of all the examples determine a “vote” for the current example. The vote, a single-class probability, is the sum of each example's actual class (0 or 1) times its weight, divided by the sum of all weights. This creates a weighted prediction in which some examples get more weight than others; setting the power to 0 would give every slice the same weight. Ground truth is not taken out, so you will see perfect examples in the results.
  2. Low voice between 820 and 845
  3. Blue = sung; green = spoken; red = instrumental
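
The distance-weighted vote described in note 1 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not ARLO's actual code: the function names, the `>=` comparison against the threshold, and the handling of zero distances are all hypothetical choices. Only the city-block distance, the 1/distance^power weighting, the default power of 15, and the 0.4 threshold come from the note.

```python
from typing import Sequence


def city_block_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute per-dimension differences (power of 1)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def weighted_vote(
    query: Sequence[float],
    examples: Sequence[Sequence[float]],
    labels: Sequence[int],
    power: float = 15.0,
) -> float:
    """Estimate the probability that `query` belongs to class 1.

    Each labeled example votes with weight 1 / distance**power.
    """
    distances = [city_block_distance(query, ex) for ex in examples]

    # Assumption: a zero distance would give an infinite weight, so exact
    # matches decide the vote on their own. This mirrors the note's remark
    # that slices left in the ground truth come back as perfect predictions.
    exact = [lab for d, lab in zip(distances, labels) if d == 0.0]
    if exact:
        return sum(exact) / len(exact)

    # Multiplying every weight by d_min**power leaves the vote unchanged
    # and keeps the arithmetic stable for large powers.
    d_min = min(distances)
    weights = [(d_min / d) ** power for d in distances]
    return sum(w * lab for w, lab in zip(weights, labels)) / sum(weights)


def classify(
    query: Sequence[float],
    examples: Sequence[Sequence[float]],
    labels: Sequence[int],
    power: float = 15.0,
    threshold: float = 0.4,
) -> bool:
    """Assumption: a slice is assigned to the class when vote >= threshold."""
    return weighted_vote(query, examples, labels, power) >= threshold
```

In the setting the note describes, each example would be a 256-value vector for one 1/32-second slice. With `power=0` every example receives the same weight, so the vote reduces to the plain fraction of class-1 examples, which matches the note's remark about setting the power to 0.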