FIAT/IFTA MMC Seminar May 2015. Giant Steps: The 2020 Broadcaster Archive. Theo Mausli SRG SSR

This document discusses archive strategies for 2015-2020. It proposes that archives keep their entire own production in high definition, if economically sustainable, and increase harmonization, standardization, and centralization. An SRG Task Force will analyze what SRG content production will look like in 2020 and what role archives will play, develop archive policies that take into account the evolution of production and technical and economic factors, and determine how to realize these policies through structures, tools, resources, and a roadmap. The document also describes RSI's use of semantic analysis of audio and images to automatically categorize and catalogue material and extract metadata for easier retrieval. It concludes that archives should ask the industry to provide integrated standard solutions and should accept common workflows.

Theo Mäusli (SRG SSR), with contributions from Sarah-Haye Aziz, Lorenzo
Vassallo and Francesco Veri (RSI)
FIAT/IFTA Media Management Seminar 2015
Glasgow, May 21st – 22nd

Agenda
0. Introduction, context
1. Selection
2. Dataflow
3. Traceability
4. Data mining
5. Artificial intelligence
6. The archivist is a coach
Conclusion: please, industry

SRG Task Force on realizing the archive
strategies 2015-2020
• Archiving SRG's entire own production in high definition, if economically sustainable
• Harmonization, standardization and centralization
• Greater use, also for externals
• …

Analysis in 4 steps:
1. What will SRG content production be in 2020?
What will be the role of the archives in that
context?
2. What will the archives' policies be, considering
the evolution in SRG production and the macro
evolution of technical and economic factors?
3. How can these policies be realized (structures,
instruments, technical and human resources)?
4. Scenarios, roadmap and cost evaluations

6 main hypotheses:
(slides 5 to 11; no slide text survives)

Conclusion 1: this is not new
Many archives are already behaving as a 2020
archive, using dataflows, data mining and new
skills (very dynamic experiences at RTS and
SRF)
The case of RSI (updates to the last seminar
contribution by Sarah, Lorenzo and Francesco)

Updates on SIA at RSI
Sarah-Haye Aziz, Lorenzo Vassallo, Francesco Veri
May 2015

SIA's sources in TV and Radio (2011-2012)
• TV: audio and images
• Radio: audio

…
Differences between Radio and TV
• Radio: background music/noise does not help the transcription.
• Radio: based only on silences and without key frames, the system creates too many sequences.
• TV: key frames help to locate a change of context.
• Speech rhythm and pauses differ between radio and TV.

Solution – Capitalize on 24h Radio Logging
[Diagram labels: 24h Radio Logging (0 to 24), Editorial Text, Semantic Engine, Semantic Analysis, Categorisation, Automatic Cut, Catalogue, Themes, Geo Ref, Credits]

User download from 24h logging
• Audio material retrievable using metadata automatically acquired from the channel schedule
• Download from the 24 h Radio logging

Capitalize on editorial texts
• Automatic semantic analysis of the editorial text
• Automatic uploading of the editorial texts to CMM
• Terms not related to our thesaurus become keywords

Further implementation
• This solution can be implemented on web pages
• Automatic semantic analysis of web content
• Automatic categorisation of web page content

Hello everybody!
Contact us for more information!
Well done Theo!
francesco.veri@rsi.ch
sarah-haye.aziz@rsi.ch
lorenzo.vassallo@rsi.ch

Conclusion: please, industry
• The features are not new, but they are not integrated into the archive
systems yet
• It is not the job of the archivists to integrate them
• The archivist community should ask the industry to provide standard
solutions
• The archive community has to accept common workflows and standards
(learned yesterday!)

Conclusion: … and collaboration
SRG SSR looks forward to collaborating and
benchmarking with the FIAT/IFTA community
Thank you for your attention
Theo Mäusli, Lugano, Switzerland
Theo.maeusli@srgssr.ch
+41 91 803 51 28

Editor's Notes

Giant steps: the 2020 broadcaster's archive
In the efficient management of metadata, the road to 2020 will be covered in giant steps, drawing on concrete experience inside broadcasters and archives as well as on existing prototypes and credible research and industry announcements. We believe that the relevant changes will happen mostly on 6 levels:
• Selection: with the ever lower cost of storage, selection no longer happens at the level of what to keep, but of what to invest in metadata.
• Dataflow: IT-based production workflows make it possible to converge all production data into metadata.
• Traceability: the use of archive documents (and of parts of them) will itself become dynamic metadata.
• Data mining: speech-to-text, image and face recognition techniques make it possible to understand and find pertinent content. Big data analysis provides contextual information, if well filtered and organized.
• Artificial intelligence: provides the pertinent documents for the users' requests, and suggests pertinent requests.
• The new archivist is a coach: she/he will teach and train the archive systems to keep and to emphasize the interesting information and to provide the right material.
We expect well-integrated products from the industry so that these giant steps will be sure steps. FIAT/IFTA can motivate the industry to do so and coordinate the archivists' requests and ideas.
Welcome. Before I start, I would like to emphasize that this presentation is a brief update on the Automatic Indexing System used in the RSI radio archives. Two years ago, at the FIAT/IFTA conference in Amsterdam, we presented the implementation of the Automatic Indexing System in TV and its impact on the daily work of archivists. At that time the system received good feedback from several institutions, which contacted us for further information. In particular, they were interested in two things: the synergy between IT staff and archivists during the tuning phase of the system, in which archivists play a central role in developing improvements such as using different colors to distinguish between human and automatic indexing; and the transformation of the archivist's work, since the archivist can now choose between different paths of documentation: human documentation, automatic documentation, or a combination of automatic and human documentation.
For those who were not in Amsterdam two years ago, I would like to reiterate that in 2011 the Radiotelevisione della Svizzera Italiana (RSI), the Swiss Italian broadcaster, introduced an Automatic Indexing System, consisting of the automatic transcription of audio sources (speech to text) and automatic semantic analysis, in its archive's Multimedia Catalogue, which is called CMM. Today we would like to present how the Automatic Indexing System has developed in the context of radio.
From now on, when I refer to CMM I am referring to RSI's archiving database, and when I talk about the Automatic Indexing System I will use the Italian acronym SIA (Sistema d'Indicizzazione Automatica).
Slide 2: SIA's sources in TV and Radio
In 2011-2012 we implemented the SIA in both radio and TV. In radio, the automatic transcription was based on the audio; in TV, it was based on audio and images.
Slide 3: Differences between Radio and TV
We noticed that the automatic transcriptions of radio and TV were radically different. First, radio has background noise or music, which interferes with the speech-to-text and creates inaccurate transcriptions. Second, radio and TV are different types of media with different narrative languages, and speech pauses carry a different significance in their programs. As a result, the sequences generated by the SIA from radio were based on silences alone, resulting in too many sequences. By contrast, the SIA works well in TV because of the presence of images. The key frames in video help to locate a change of context and thus create logical sequences in TV documents. For example, in a report with several interviewees, the change of key frame from one person to another helps the SIA to create a separate sequence for each person, so the system automatically puts each person's full interview into one sequence. In other words, we considered it too hazardous to use one common technical solution for TV and radio.
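The over-segmentation problem can be illustrated with a minimal sketch. This is not the SIA's actual algorithm, which the talk does not describe; the function name `split_on_silence`, the per-frame energy values, the threshold and the `min_silence` parameter are all assumptions made for illustration. It shows how a cutter that relies on silences alone produces more sequences as soon as short speech pauses count as cuts.

```python
from typing import List, Tuple


def split_on_silence(
    energy: List[float], threshold: float, min_silence: int
) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs (end exclusive) of non-silent sequences.

    Hypothetical helper: a cut is made only where at least `min_silence`
    consecutive frames fall below `threshold`.
    """
    segments: List[Tuple[int, int]] = []
    start = None      # start index of the open sequence, None if none is open
    silent_run = 0    # length of the current run of silent frames
    for i, value in enumerate(energy):
        if value < threshold:
            silent_run += 1
            if start is not None and silent_run >= min_silence:
                # Close the sequence at the first frame of the silent run.
                segments.append((start, i - silent_run + 1))
                start = None
        else:
            if start is None:
                start = i
            silent_run = 0
    if start is not None:
        # Close the last sequence, dropping any trailing silent frames.
        segments.append((start, len(energy) - silent_run))
    return segments


# Toy signal: a short speech pause at frame 2, a longer silence at frames 5 to 7.
frames = [1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0]
every_pause = split_on_silence(frames, threshold=0.5, min_silence=1)
long_pauses = split_on_silence(frames, threshold=0.5, min_silence=3)
```

With every pause treated as a cut, the toy signal yields three sequences; requiring a longer silence yields two. Radio speech contains many short pauses and no key frames to confirm a change of context, which is consistent with the too many sequences reported above.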
Slide 4: Solution: capitalize on 24h Radio Logging
To deal with radio's pitfalls, we now apply the SIA to pre-existing editorial texts and add these texts as attachments to our multimedia catalogue.
Slide 5: User download from 24h logging
From the user's point of view, we have implemented the metadata automation directly within the broadcast programming of the day (24h Radio Logging), which includes the editorial text written by the journalists. The system automatically retrieves the material using metadata acquired from the channel schedule (such as titles and play-out dates).
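As a rough sketch of schedule-driven retrieval, the snippet below maps a title and play-out date to a time range inside a 24-hour recording. The `ScheduleEntry` type, the `find_in_log` function and the sample programme are hypothetical; the talk does not specify how CMM or the logging system represents this data.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Tuple


@dataclass
class ScheduleEntry:
    """One line of a channel schedule (hypothetical structure)."""
    title: str
    start: datetime
    end: datetime


def find_in_log(
    schedule: List[ScheduleEntry], title: str, playout_date: date
) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in seconds from midnight of the play-out date
    for every schedule entry matching the title on that date."""
    midnight = datetime.combine(playout_date, datetime.min.time())
    hits: List[Tuple[int, int]] = []
    for entry in schedule:
        same_title = entry.title.lower() == title.lower()
        if same_title and entry.start.date() == playout_date:
            hits.append((
                int((entry.start - midnight).total_seconds()),
                int((entry.end - midnight).total_seconds()),
            ))
    return hits


# Invented sample data for one day of programming.
schedule = [
    ScheduleEntry("Morning News", datetime(2015, 5, 21, 7, 0), datetime(2015, 5, 21, 7, 30)),
    ScheduleEntry("Midday News", datetime(2015, 5, 21, 12, 30), datetime(2015, 5, 21, 13, 0)),
]
midday = find_in_log(schedule, "midday news", date(2015, 5, 21))
```

The returned offsets could then be used to cut the requested segment out of the 24h recording, so that the user never has to know where in the day a programme sits.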
Slide 6: Capitalize on editorial texts
From the broadcast programming, the system automatically picks up the editorial texts and performs a semantic analysis of the text, which consists of the extraction of credits, geographical terms and terms that are linked to our thesaurus. The result is automatically uploaded and attached to the respective audio file in our database. Other terms, which are not linked to our thesaurus, are also extracted and appear as keywords. In short, the Automatic Indexing System also tags the text automatically.
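The split between thesaurus terms and free keywords can be sketched as follows. This is a deliberately simplified illustration, not RSI's semantic engine: the function `tag_editorial_text`, the dictionary-based thesaurus and the stopword set are assumptions, and a real engine would also handle credits, geographical references, multi-word terms and inflected forms.

```python
import re
from typing import Dict, List, Set


def tag_editorial_text(
    text: str, thesaurus: Dict[str, str], stopwords: Set[str]
) -> Dict[str, List[str]]:
    """Separate the words of an editorial text into thesaurus descriptors
    and free keywords (hypothetical, single-word matching only)."""
    # Runs of letters only; digits, underscores and punctuation split tokens.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    descriptors: List[str] = []
    keywords: List[str] = []
    for token in tokens:
        if token in stopwords:
            continue
        if token in thesaurus:
            descriptor = thesaurus[token]
            if descriptor not in descriptors:
                descriptors.append(descriptor)
        elif token not in keywords:
            # Not in the thesaurus: keep it as a free keyword.
            keywords.append(token)
    return {"thesaurus": descriptors, "keywords": keywords}


# Invented sample thesaurus and text.
sample_thesaurus = {"election": "Elections", "lugano": "Lugano (place)"}
tags = tag_editorial_text(
    "Election debate in Lugano", sample_thesaurus, stopwords={"in"}
)
```

In this toy run, "election" and "lugano" resolve to controlled descriptors, while "debate", which the sample thesaurus does not contain, is kept as a keyword.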
Slide 8: Further implementation of the Automatic Semantic Analysis
Automatic semantic analysis has several implementation options, for example its use on web pages. In this case, the system performs an automatic semantic analysis of the content of the web pages, which results in the automatic tagging of each web page. This will facilitate the categorization of web page content through keywords.
