RSI Archive: our experience working with
Speech to Text and Semantic Analysis
Sarah-Haye Aziz and Lorenzo Vassallo
May 17, 2013
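Audio Transcription (automatic output, in Italian):
Come al solito, anche la recente inaugurazione dell'ultima monumentale opera di quell'eccezionale scultore che Giacomo Manzù, vale a dire la nuova porta del Duomo di Rotterdam avuto il sapore di un avvenimento straordinario di risonanza internazionale e per lavori in corso Fabio Bonetti è riuscito ad avvicinare l'insigne maestro bergamasco, a buon diritto ritenuto ormai uno dei più alti interpreti del nostro tempo, artista fra i più grandi del secolo e non solo per la misura del suo talento ma anche per il rigore morale di cui è sempre stato esempio in anni di sorta, tormentata, ispirata attività.
English translation (of the intended sentence): As usual, the recent inauguration of the latest monumental work by that exceptional sculptor Giacomo Manzù, namely the new door of Rotterdam Cathedral, had the flavour of an extraordinary event of international resonance, and for "Lavori in corso" Fabio Bonetti managed to approach the distinguished master from Bergamo, rightly regarded by now as one of the highest interpreters of our time, an artist among the greatest of the century, not only for the measure of his talent but also for the moral rigour of which he has always been an example through years of absorbed, tormented, inspired activity.
Errors marked on the slide: è, as, ha
Categorization:
Credits: Giacomo Manzù, Fabio Bonetti
Geographic Terms: Rotterdam
Themes: arte, cultura, intrattenimento (art, culture, entertainment)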
Outline
1. Why an automatic indexing system?
2. The project timeline
3. Two paths: system and archivist workflow overview
4. Does it work? We learned that...
5. Next steps
6. Some advice
Why an automatic indexing system?
RSI has had a consolidated cataloguing system (CMM) with a well-defined human workflow since 2008.
RSI has plenty of undocumented historical material and no capacity to document it.
Increase (plus) the documented material by adding automation, without replacing (vs) the archivist.
Not vs but plus!
Project timeline
Archivists and Technicians Synergy
Phases: Analysis & Startup, Tuning, Deployment
Activities: Workflow Design, Language Model, TV & Radio Programmes Choice, Workflow Review, Transcription Test, System Test
Documenting a material: two paths
Catalogue workflow: Ingestion, Catalogue, Publishing
Path with automation (SIA):
  Input from ingestion: audio and video key frames
  Transcription Engine (speech transcription): Audio + Key frames in, Text + Sequences out
  Semantic Engine (categorization): Text + Sequences in, Credits and Themes + Geographical terms out
  Archivist: documentation + refinements
Human path:
  Human audio listening and transcription + archivist documentation
The two paths for the archivist
Start: invoke indexing?
No: Human Task on Catalogue. Doc Level:
  Basic Human: limited set of documented metadata
  High Human: detailed documentation, manual creation of logical sequences
Yes: Automated Transcription and Categorisation, then Human Task on Catalogue. Doc Level:
  Basic Human with Automation: limited set of documented metadata, automatic creation of logical sequences
  High Human with Automation: detailed documentation, automatic creation of logical sequences
Publish
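The flowchart on this slide can be read as two yes/no decisions: whether indexing is invoked, and whether the archivist produces detailed documentation. The following is a minimal Python sketch under that reading. The names `DocLevel` and `doc_level` are hypothetical, and the pairing of descriptions with levels is inferred from the slide labels, not a documented RSI rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocLevel:
    name: str       # label as it appears on the slide
    metadata: str   # depth of the archivist's documentation
    sequences: str  # how logical sequences are created, if at all


def doc_level(automated: bool, detailed: bool) -> DocLevel:
    """Map the two decisions to one of the four documentation levels.

    Assumption: the "Basic Human" level has no logical sequences,
    because the slide lists none for it.
    """
    metadata = (
        "detailed documentation"
        if detailed
        else "limited set of documented metadata"
    )
    if automated:
        sequences = "automatic"
    else:
        sequences = "manual" if detailed else "none"
    name = "High Human" if detailed else "Basic Human"
    if automated:
        name += " with Automation"
    return DocLevel(name, metadata, sequences)


if __name__ == "__main__":
    for automated in (False, True):
        for detailed in (False, True):
            print(doc_level(automated, detailed))
```

In this reading, automation never changes how much the archivist documents; it only changes who creates the logical sequences.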
The archivist – Francesco Veri
Does it work? Yes! But…
Differences between Radio and TV
Background music and noise make transcription harder.
Based only on silences and without key frames, the system creates too many sequences.
Key frames help to locate a change of context.
Speech rhythm and pauses differ between Radio and TV.
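To illustrate why silences alone over-segment, here is a minimal Python sketch that is not the system described in the slides. It assumes silences and key frames are available as timestamps in seconds, and keeps a silence as a sequence boundary only if a key frame lies within a tolerance of it. The function name, the tolerance value and the sample timestamps are all hypothetical.

```python
from bisect import bisect_left


def boundaries_with_key_frames(silences, key_frames, tolerance=2.0):
    """Keep only the silences that fall within `tolerance` seconds of a key frame."""
    frames = sorted(key_frames)
    kept = []
    for t in sorted(silences):
        i = bisect_left(frames, t)
        # The nearest key frame is either just before or just after t.
        neighbours = frames[max(i - 1, 0): i + 1]
        if any(abs(t - f) <= tolerance for f in neighbours):
            kept.append(t)
    return kept


if __name__ == "__main__":
    silences = [12.0, 30.5, 47.0, 61.2, 90.0, 118.4]  # made-up pause timestamps
    key_frames = [31.0, 89.0]                          # made-up shot changes
    cuts = boundaries_with_key_frames(silences, key_frames)
    # n boundaries split the material into n + 1 sequences.
    print("silences only:", len(silences) + 1, "sequences")
    print("with key frames:", len(cuts) + 1, "sequences, cuts at", cuts)
```

For radio there are no key frames, so this filter would discard every boundary; that is one reason a different approach is needed for each medium.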
Next steps (1) – Capitalize on Editorial Texts
Audio + Editorial Texts -> Semantic Engine (Categorization) -> Catalogue
Next steps (2) – Capitalize on 24h Radio Logging
Diagram elements: 24h Radio Logging (timeline from 0 to 24), SIA (Transcription and Semantic Engine), Transcription & Categorization, Automatic Cut, Catalogue
If you... some advice
Involve the archivists
Take a different approach for Radio and for TV
Choose the right TV & Radio programmes
sarah-haye.aziz@rsi.ch
lorenzo.vassallo@rsi.ch
