Presentation 17 may morning case study 2 sarahhaye aziz


  1. RSI Archive: our experience working with Speech to Text and Semantic Analysis. Sarah-Haye Aziz and Lorenzo Vassallo, May 17, 2013
  2. Audio Transcription (automatic output, translated from Italian; recognition errors left in place): "As usual, the recent inauguration of the latest monumental work by that exceptional sculptor Giacomo Manzù, namely the new door of Rotterdam Cathedral, had the feel of an extraordinary event of international resonance, and for Lavori in corso Fabio Bonetti managed to approach the distinguished master from Bergamo, by now rightly considered one of the highest interpreters of our time, one of the greatest artists of the century, not only for the measure of his talent but also for the moral rigour of which he has always been an example through years of 'sorta' (sic), tormented, inspired activity."
     Categorization:
     Credits: Giacomo Manzù, Fabio Bonetti
     Geographic Terms: Rotterdam
     Themes: art, culture, entertainment
     Errors marked on the slide: "è", "as", "ha"
  3. Outline
     1. Why an automatic indexing system?
     2. The project timeline
     3. Two paths: system and archivist workflow overview
     4. Does it work? We learned that...
     5. Next steps
     6. Some advice
  4. Why an automatic indexing system?
     RSI has had a consolidated cataloguing system (CMM) with a well-defined human workflow since 2008.
     RSI has plenty of undocumented historical material and no capacity to document it.
     The goal is to increase (plus) the documented material by adding automation, without replacing (vs) the archivist.
     Not vs but plus!
  5. Project timeline: Archivists and Technicians Synergy
     Stages shown: Deployment, Tuning, Analysis & Startup, Workflow Design, Language Model, TV & Radio Programmes Choice, Workflow Review, Transcription Test, System Test
  6. Documenting a material: two paths
     Stages: Ingestion (audio and video key frames), Catalogue, Publishing.
     Automated path (SIA): the Transcription Engine takes audio + key frames and performs speech transcription, producing text + sequences. The Semantic Engine performs categorization on text + sequences, producing credits, themes + geographical terms. The archivist then adds documentation + refinements.
     Human path: human audio listening and transcription + archivist documentation.
  7. The two paths for the archivist
     Flow: Start, then a decision on whether to invoke indexing (Yes/No), then Automated Transcription and Categorisation and/or a Human Task on Catalogue, then Publish.
     Resulting documentation levels (DocLevel):
     Basic Human: limited set of documented metadata.
     High Human: detailed documentation, manual creation of logical sequences.
     Basic Human with Automation: limited set of documented metadata, automatic creation of logical sequences.
     High Human with Automation: detailed documentation, automatic creation of logical sequences.
  8. The archivist – Francesco Veri
  9. Does it work? Yes! But… Differences between Radio and TV
     Background music/noise does not help the transcription.
     Based only on silences and without key frames, the system creates too many sequences.
     Key frames help to locate a change of context.
     Speech rhythm and pauses are different between Radio and TV.
  10. Next steps (1) – Capitalize on Editorial Texts
     Audio + Editorial Texts, Semantic Engine, Categorization, Catalogue
  11. Next steps (2) – Capitalize on 24h Radio Logging
     24h Radio Logging (0 to 24), Automatic Cut, SIA (Transcription and Semantic Engine), Transcription & Categorization, Catalogue
  12. If you... some advice
     Involve the archivists.
     Take a different approach in Radio and TV.
     Choose the right TV & Radio programmes.
  13. sarah-haye.aziz@rsi.ch, lorenzo.vassallo@rsi.ch
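
Slide 9 observes that cutting on silences alone creates too many sequences, and that key frames help to locate a change of context. The following is a minimal sketch of that idea, not the RSI/SIA implementation: the function name `sequence_boundaries`, the thresholds `min_silence` and `tolerance`, and the sample timestamps are all hypothetical. It keeps a silence as a sequence boundary only when the silence is long enough and a key frame lies close to it.

```python
from bisect import bisect_left


def sequence_boundaries(silences, key_frames, min_silence=1.0, tolerance=2.0):
    """Return midpoints (seconds) of silences that are at least `min_silence`
    long and lie within `tolerance` seconds of a key frame.

    All names and thresholds are illustrative assumptions.
    """
    frames = sorted(key_frames)
    boundaries = []
    for start, end in silences:
        if end - start < min_silence:
            continue  # short pause inside speech, not a change of context
        mid = (start + end) / 2
        i = bisect_left(frames, mid)
        # Nearest key frames on either side of the silence midpoint.
        neighbours = frames[max(i - 1, 0):i + 1]
        if any(abs(frame - mid) <= tolerance for frame in neighbours):
            boundaries.append(mid)
    return boundaries


# Hypothetical input: (start, end) of detected silences and key-frame times.
silences = [(10.0, 10.3), (25.0, 26.5), (40.0, 41.2), (58.0, 60.0)]
key_frames = [26.0, 59.5]
print(sequence_boundaries(silences, key_frames))  # [25.75, 59.0]
```

In this sample, silences alone would give three cuts among the long pauses, while requiring a nearby key frame leaves two. Radio has no key frames, so this rule returns no boundaries there and a different rule would be needed, in line with the advice to take a different approach in Radio and TV.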
