Audiovisual archiving at Ina
Globo/IFTA seminar, Brazil, May 2016
Anne Couteux - Ina
The French National Audiovisual Institute,
missions and audiovisual collections
Ina’s missions
1975: creation of Ina, the French National Audiovisual Institute
Missions
• Preservation of the audiovisual heritage of ORTF (the French radio and
television broadcasting authority, 1964-1974)
• Training, R&D
• Production, audiovisual creation
127 million euros budget in 2016
972 employees (2015)
14 700 000 hours of TV and radio documents
1 200 000 preserved photographs
123 television channels and radio stations picked up under legal
deposit
1 232 096 hours of online archives available on inamediapro.com
3 000 professionals trained each year
156.7 million views on ina.fr
Bry sur Marne
Paris
Lille
Rennes
Strasbourg
Lyon
Marseille
Toulouse
Ina’s collections
The audiovisual heritage of ORTF (Office de radiodiffusion-télévision française)
1986: Ina is entitled to sell archives for new audiovisual productions:
• Public TV programs (from 1949)
• Public radio programs (from 1902)
Professional archives database: 1.2 M hours of radio and TV programs
- Target audience: directors, journalists, institutions… needing footage
- Use: commercial
- Access:
  • Inamediapro: https://www.inamediapro.com/
  • Ina.fr (selection of media free of rights): http://www.ina.fr
Ina’s collections
1995: application of a 1992 act on the legal deposit of
French TV and radio programs (public and private)
• 7 TV channels in 1995 > 100 channels in 2016
• 6 radio channels in 1995 > 20 channels in 2016 (soon 64)
Legal deposit database: 1 M hours of radio and TV programs/year
- Target audience: educational and academic users
- Use: patrimonial, scientific (research, studies)
- Access: Ina THEQUE
  http://www.inatheque.fr/index.html
Ina’s collections
2000: Ina started archiving private audiovisual resources
Examples: Centre Georges Pompidou, Ariana Films (Afghanistan), newspaper, IOC (Olympic Games)
From:
• producers
• cultural institutions…
• individuals (directors, artists…)
• private companies
- Uses: commercial and/or patrimonial
- Stored in: Legal deposit database or Professional archives database
- Access:
  • Inamediapro
  • Ina THEQUE
Workflows and descriptive metadata
Cataloguing and descriptive metadata
Metadata generated through different activities:
1/ Management of the daily inflow (metadata concerning TV and radio
programs from 120 channels): 2 sources of metadata
- external:
  • from private content providers (Plurimedia, Mediametrie,
    Kantar Media…)
  • from broadcasters (France Télévisions, Radio France…)
- internal: segmentation and content description of informative
  programs (news, magazines, documentaries…) by Ina’s archivists
Cataloguing and descriptive metadata
Day -1: data import before broadcast (data about programs, contents, forecast broadcast times)
Day +1: data import after broadcast (actual programming)
Synchronisation
Automatic creation of records in the Legal deposit database
Quality control, alignment, enrichment and validation of the cataloguing data by cataloguers
Link to media file
Content description, segmentation and indexing by archivists
Records of public TV and radio programs transferred to the « Professional archives » database
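To make the Day -1 / Day +1 synchronisation step more concrete, here is a minimal sketch in Python. It is not Ina's implementation: the `ScheduleEntry` fields, the matching rule (same channel, same title, start times within a tolerance) and the `to_validate` status are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ScheduleEntry:
    # Hypothetical minimal shape of one imported programme line.
    channel: str
    title: str
    start: datetime


def synchronise(forecast, as_broadcast, tolerance=timedelta(minutes=15)):
    """Pair each as-broadcast entry with the closest forecast entry.

    Assumed rule: same channel, same title (case-insensitive) and start
    times within `tolerance`. The as-broadcast start time is kept.
    """
    records = []
    unmatched = list(forecast)
    for actual in as_broadcast:
        candidates = [
            f for f in unmatched
            if f.channel == actual.channel
            and f.title.lower() == actual.title.lower()
            and abs(f.start - actual.start) <= tolerance
        ]
        match = min(candidates, key=lambda f: abs(f.start - actual.start), default=None)
        if match is not None:
            unmatched.remove(match)
        records.append({
            "channel": actual.channel,
            "title": actual.title,
            "start": actual.start,
            "forecast_start": match.start if match else None,
            "status": "to_validate",  # a cataloguer still has to check the record
        })
    return records


forecast = [
    ScheduleEntry("Channel A", "Evening news", datetime(2016, 5, 2, 20, 0)),
    ScheduleEntry("Channel A", "Documentary", datetime(2016, 5, 2, 20, 40)),
]
as_broadcast = [
    ScheduleEntry("Channel A", "Evening News", datetime(2016, 5, 2, 20, 2)),
    ScheduleEntry("Channel A", "Special report", datetime(2016, 5, 2, 20, 45)),
]
records = synchronise(forecast, as_broadcast)
```

In this sketch every as-broadcast entry produces a record, matched or not, because the Day +1 data reflects what was actually aired; unmatched entries simply carry no forecast time and are left for the cataloguers' control step.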
2/ Valorisation activities
- Structuring of the collections
  Multimedia sets of archives about personalities or thematic issues
- Contextualisation, editorialisation
  • Preparation of content to be put online on the occasion of special events
  • Creation of thematic frescoes…
Professional archives database:
  - selection, annotation and indexing of full-length archives
  - creation of clips
  - media file segmentation
3/ Quality control operations
Different uses, different content descriptions

LEGAL DEPOSIT DATABASE (27 364 000* records)
Cataloguing
+ Thematic description: issues, people, places, historical periods, events…
+ Descriptors: issues, people, places

PROFESSIONAL ARCHIVES DATABASE (8 547 000* records)
Cataloguing
+ Thematic description: issues, people, places, historical periods, events…
+ Analytic description: shotlists, sounds, effects
+ Descriptors: issues, people, places, footage, sounds

* TV and radio
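The difference between the two description profiles can be written down as data. The sketch below is only an illustration: the layer and facet names are taken from the slide, while the dictionary layout and the `extra_in_professional` helper are assumptions, not Ina's schema.

```python
# Description layers listed on the slide, as plain dictionaries.
LEGAL_DEPOSIT_LAYERS = {
    "cataloguing": [],
    "thematic description": ["issues", "people", "places", "historical periods", "events"],
    "descriptors": ["issues", "people", "places"],
}

PROFESSIONAL_ARCHIVES_LAYERS = {
    "cataloguing": [],
    "thematic description": ["issues", "people", "places", "historical periods", "events"],
    "analytic description": ["shotlists", "sounds", "effects"],
    "descriptors": ["issues", "people", "places", "footage", "sounds"],
}


def extra_in_professional(legal, professional):
    """Return, layer by layer, what the professional description adds."""
    extra = {}
    for layer, facets in professional.items():
        added = [f for f in facets if f not in legal.get(layer, [])]
        if layer not in legal or added:
            extra[layer] = added
    return extra


difference = extra_in_professional(LEGAL_DEPOSIT_LAYERS, PROFESSIONAL_ARCHIVES_LAYERS)
```

Here `difference` contains only the analytic description layer and the footage and sounds descriptors, which is the part of the description aimed at clients looking for footage.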
Applications and resources used for documentary tasks
Segmentation and annotation application
Mediascope
http://www.inatheque.fr/consultation/mediascope.html
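As an illustration of what timecoded segmentation and annotation involve, here is a small Python sketch. It does not reflect Mediascope's actual data format: the `Segment` class, the `HH:MM:SS` timecode form and the overlap check are assumptions made for the example.

```python
from dataclasses import dataclass


def parse_timecode(tc):
    """Convert an 'HH:MM:SS' timecode into seconds."""
    hours, minutes, seconds = (int(part) for part in tc.split(":"))
    return hours * 3600 + minutes * 60 + seconds


@dataclass
class Segment:
    # Hypothetical annotation unit: a time span plus a free-text note.
    start: str
    end: str
    note: str

    @property
    def duration(self):
        """Length of the segment in seconds."""
        return parse_timecode(self.end) - parse_timecode(self.start)


def no_overlap(segments):
    """Return True when no two segments overlap in time."""
    ordered = sorted(segments, key=lambda s: parse_timecode(s.start))
    return all(
        parse_timecode(a.end) <= parse_timecode(b.start)
        for a, b in zip(ordered, ordered[1:])
    )


segments = [
    Segment("00:00:00", "00:01:30", "opening titles"),
    Segment("00:01:30", "00:04:10", "first report"),
]
```

A check such as `no_overlap` is the kind of consistency rule a segmentation tool could apply before annotations are saved; whether Mediascope does so is not stated in the source.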
Thesaurus in French, English, Spanish
Concepts : 11 639
Persons : 81 003
Organizations : 7 715
Places : 45 467
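A multilingual thesaurus of this kind can be pictured as terms that carry one label per language. The Python sketch below is an assumption-laden illustration: the `Term` class, the identifiers and the fallback to French are invented for the example and say nothing about how the thesaurus is stored at Ina.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    term_id: str                                  # hypothetical identifier
    category: str                                 # "concept", "person", "organization" or "place"
    labels: dict = field(default_factory=dict)    # language code -> label

    def label(self, lang, fallback="fr"):
        """Return the label in `lang`, or the fallback language if missing."""
        return self.labels.get(lang, self.labels.get(fallback))


def build_index(terms):
    """Map every lowercase label, in any language, to its term id."""
    index = {}
    for term in terms:
        for text in term.labels.values():
            index[text.lower()] = term.term_id
    return index


terms = [
    Term("place:0001", "place", {"fr": "Brésil", "en": "Brazil", "es": "Brasil"}),
    Term("concept:0001", "concept", {"fr": "carnaval", "en": "carnival"}),
]
index = build_index(terms)
```

With such an index, a search typed in any of the languages resolves to the same term, so records indexed once can be found whatever the language of the query.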
Multilingual Thesaurus
Places
Online help application
Project for a new information system
Flowchart showing the current system: main data streams
(non-exhaustive schematic diagram)
Professional archives / Legal deposit
Oracle databases [documentary and material data]
Importing of external metadata (Plurimédia, Mediametrie, Kantar, Edison, Lisa, Gilda, Sierra, TF1/LCI)
Text archives (Gestion Document 4D)
Technical applications (ARPP, CAPTN, etc.)
Consultation databases (exports)
Customer management applications (Workflow Radio & TV, Gescom, InaMédiapro, Ina.fr)
Legal applications (Adaje, Aida)
Technical applications (Batchnum, Scandir, File Registration, Infos fichiers HSM, Info fichiers Radio)
Oracle databases [documentary and material data]
Selection and transfer
Legend: data use; data creation / updating
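The speaker notes for this diagram point out that the two databases are not systematically co-synchronised, so the same record can drift apart. A minimal Python sketch of how such divergences could be listed follows; the shared record identifier, the field names and the `find_divergences` helper are assumptions for illustration only.

```python
def find_divergences(legal_deposit, professional,
                     shared_fields=("title", "broadcast_date", "channel")):
    """Compare records present in both stores and report differing shared fields.

    Both arguments map a (hypothetical) shared record id to a dict of fields.
    """
    report = {}
    for record_id in legal_deposit.keys() & professional.keys():
        left, right = legal_deposit[record_id], professional[record_id]
        diffs = {
            name: (left.get(name), right.get(name))
            for name in shared_fields
            if left.get(name) != right.get(name)
        }
        if diffs:
            report[record_id] = diffs
    return report


legal_deposit = {
    "R1": {"title": "Journal de 20h", "channel": "Channel A", "broadcast_date": "2016-05-02"},
    "R2": {"title": "Magazine", "channel": "Channel B", "broadcast_date": "2016-05-02"},
}
professional = {
    "R1": {"title": "Journal de 20 heures", "channel": "Channel A",
           "broadcast_date": "2016-05-02", "shotlist": ["studio", "report"]},
}
report = find_divergences(legal_deposit, professional)
```

Fields that exist in only one database, such as the shot list, are ignored here on purpose: they are legitimate differences between the two description profiles, not divergences.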
Main data streams of the target information system
(non-exhaustive schematic diagram)
Data lake
Importing of external metadata (Plurimédia, Mediametrie, Kantar, Edison, Lisa, Gilda, Sierra, TF1/LCI)
Technical applications (Batchnum, Scandir, File Registration, Infos fichiers HSM, Info fichiers Radio, Autres fonds, ARPP, CAPTN, Ossean, etc.)
Consultation and documentary and material metadata processing applications = Notilus
Customer management applications and databases (Workflow Radio & TV, Gescom, InaMédiapro, Ina.fr)
Legal applications and databases (Adaje, Aida)
Access environments (InaThèque, PCM)
Legend: data use; data creation / updating
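The idea of a data lake that brings together metadata from documentary, legal and commercial applications can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the `programme_id` key, the field names and the `merge_sources` helper are hypothetical and do not describe the actual architecture.

```python
from collections import defaultdict


def merge_sources(**sources):
    """Group metadata from several applications under one key per programme.

    Each source's fields are kept under the source name, so nothing is
    overwritten and the origin of every value stays visible.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            merged[record["programme_id"]][source_name] = {
                key: value for key, value in record.items() if key != "programme_id"
            }
    return dict(merged)


merged = merge_sources(
    documentary=[{"programme_id": "P1", "title": "Evening news"}],
    legal=[{"programme_id": "P1", "rights_cleared": True}],
    commercial=[{"programme_id": "P2", "sold_extracts": 3}],
)
```

Keeping each source in its own namespace is a deliberate choice in this sketch: it avoids deciding which application is authoritative for a field, and it preserves the origin of the data, which the speaker notes mention as a possible criterion for a later opening policy.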
A new data model
- To meet the respective needs of both documentary activities
  in a single production and retrieval tool
- To allow better interoperability with other sources or collections
- To import and describe non-broadcast audiovisual resources
- To be able to adapt our documentary practices to new audiovisual
  objects
The new model is currently being developed by the project team and IT
architects.
Data model entities (diagram): Item, Event (Evènement), Instance
A new information system with services developed by the R&D department
- Speech to text with
  • named entity extraction
  • detection of quotations
  • alignment of text and time codes
- Connection to external sources
  (artistic works, regular events)
- Footage and sound analysis (detection of
  faces, monuments, works of art, voices…)
  http://recherche.ina.fr/eng
- New multimedia player
  https://ina-foss.github.io/amalia.js/acmmm2015/
New information system, new prospects
Within legal constraints, new uses of
our data could emerge:
- development of an open data policy (for
  data generated at Ina) to encourage
  secondary uses by media specialists
- linked data: audiovisual works on Victor Hugo linked to Victor
  Hugo’s literary works archived at the Bibliothèque nationale de
  France (French National Library)?
The changing role of archivists
At Ina, the integration of external metadata has caused many changes in
documentary procedures over the last 10 years.
New challenges will come with the big data policy and the development of
semi-automatic description:
- guarantee the quality of the metadata
- ensure consistency of data in relation to uses
- use in-depth knowledge of the collections:
  • to structure massive amounts of data even more than before
  • to develop new enhancement practices
  • to guide users in their searches through ever more data
More information about Ina
http://www.institut-national-audiovisuel.fr/en/home
acouteux@ina.fr
Anne Couteux


Editor's Notes

  • #2 Introduction: project manager in documentary engineering at Ina. Outline. I am going to present: Ina’s missions and collections; workflows and descriptive metadata; IT applications and resources used for documentary work. The second part of this presentation will cover the multi-annual project for a new information system: the new data model in a few words, and new web services. As a conclusion, a few words on the future activities of information professionals at Ina.
  • #4 Ina was created in 1975, when ORTF (the French radio and television broadcasting authority, 1964-1974) was wound up. It is a public establishment of an industrial and commercial nature (EPIC). It was set up to collect and conserve French audiovisual resources, basically for the production purposes of national broadcasters. Ina was also positioned as a training, research and creative production organisation when it was founded; I will not present these activities here.
  • #6 There are 3 types of audiovisual resources at Ina. The « historical » one consists of the programs broadcast on public TV and radio (from the beginning of TV and radio broadcasting until today). In 1986, Ina was entitled to sell archives for new audiovisual productions (rights were transferred to Ina 1 year after broadcast; 6 months today). That is why this collection is commonly called « Professional archives »: the archives are meant to be sold for audiovisual productions. The collections are described in a specific database. Audiences: 1/ These audiovisual resources are intended for any producer, director, journalist… looking for footage for new audiovisual works. Use: commercial. Access: viewing, selection and purchase of extracts are available on the online access interface inamediapro after registration. OPEN INAMEDIAPRO (AFE85000552 or CPF04007161) The services (archiving, preserving and marketing) associated with these resources require not only Ina’s archiving skills but also legal analysis of intellectual property rights, so as to enable marketing and royalty payments (an activity managed by the legal department). 2/ The « general » public can also access these archives on ina.fr. OPEN INA.FR A selection of content in the public domain (media free of rights) is freely distributed online without breaching intellectual property rights. On this site the free distribution of extracts or full content is governed by an editorial selection policy. Content distributed in this way must be related to a news item (death of a personality, anniversary of a historic event, etc.). The site is updated daily according to cultural, political and sports events.
  • #7 In 1992, inspired by the legal deposit that has applied to different forms of printed material since the 16th century, the legislature extended it to French radio and television broadcasts, treated as forms of publication. Ina was chosen to take charge of this new mission. The criterion is that all programmes produced or co-produced by French national broadcasters are relevant for the legal deposit archives. The law was passed in 1992 but became effective only in 1995. In 1995, 7 TV and 6 radio channels were archived by Ina. Initially, collection was based on physical recordings supplied by broadcasters, but the system changed in 2001 when Ina began to record direct digital streams from stations and channels. This ended Ina’s dependency on broadcasters for content supply, but it also changed the very principle of collection for deposit: it went from selection at source based on statutory criteria to the comprehensive recording of broadcast content 24 hours a day. The perimeter of collection was gradually extended, and today it takes in 100 television channels and 20 (soon 64) radio stations that are recorded continuously, generating 1 million hours of new content every year. This spectacular increase influenced the way metadata are produced, as we will see later. The audience for these audiovisual resources is mainly composed of teachers and students doing scientific research on the media coverage of specific topics (how the immigration issue is treated in TV series, for instance) or on the media themselves (the evolution of political debates on TV since the 1970s, for instance). Uses: this second type of collection serves a patrimonial purpose. The archives are non-commercial and no copy is allowed (1992 act). Access: inatheque.fr, where you can find the catalogue, but no media is available online. OPEN INATHEQUE The media can be watched or listened to at the National Library (Bibliothèque nationale de France), where Ina shares a research reception area dedicated to audiovisual collections.
  • #8 To widen its range of audiovisual content, Ina started signing agreements with private partners 15 years ago. These private collections, about 400 of them, come from cultural institutions (theatres, operas…) but also from private companies such as news agencies, or from individuals (directors, TV producers…). This activity is constantly developing and, depending on the agreement between Ina and the partner, the aim is either preservation or communication of the archives. According to the final use, records are stored in the professional database and available on inamediapro, or in the legal deposit database and available on inatheque.fr.
  • #9 All the collections, whatever the audience or the uses, are searchable on the basis of cataloguing and descriptive metadata produced by Ina’s cataloguers and documentalists. The following slides describe the main lines of the workflows that create descriptive metadata.
  • #10 These metadata are generated by several activities. The main one is the management of the daily inflow, for which there are 2 sources. Over the last 10 years, the constant increase in the number of channels collected under legal deposit led us to acquire metadata from external sources, namely private content providers. Plurimedia: a « news agency » specialised in providing information about TV and radio programs and, more broadly, cinema, cultural events and leisure; it supplies TV and radio program publishers. Mediametrie: a company specialised in audience measurement and marketing studies of audiovisual and interactive media in France. Kantar Media: a company dedicated to the analysis of media content. About 80% of the metadata on the legal deposit collections come from external sources. They are mainly managed by the team of cataloguers. The other source of metadata is internal: it comes from documentary processing, with indexing, sequencing and the production of a summary.
  • #11 The process is the following. The day before broadcast we get data from Plurimedia. The day after, we get data from Mediametrie: as it reflects the actual broadcast, we cross-check it against the earlier data and synchronise the information. Once this is done, a record is automatically created in the legal deposit database. After this first stage of processing there is a cataloguing control, consisting mainly in aligning the imported data with internal references. At the same time, a link is made with the appropriate media file. Then comes more detailed documentary processing, with indexing and the production of a summary. The data is available online on the Inatheque website. The data concerning public TV and radio programs is then transferred to the professional database.
  • #12 This database is devoted to professional distribution services, which require specific processing of the data. The structuring and contextualisation activities also generate descriptive metadata, through the selection and description of full archives and also the creation, annotation and indexing of many extracts highlighting remarkable footage. Structuring the collections consists in building multimedia folders related to personalities (politicians, artists, athletes…) or thematic issues (environment, health, economy…). The objectives are: - to facilitate rapid and relevant access to the archive being sought within the large amount of data; - to offer ready-to-use archives; - to show archives that clients would not otherwise find (quotations); - to help journalists avoid using the same archive all the time… SHOW THE FRESCO Festival de Cannes. Quality control: concerns more specifically collections whose level of description does not meet Ina’s standard. Content description, segmentation, indexing, links to appropriate media files.
  • #13 Scientific uses on the one hand and the sale of footage on the other affect the choices made to describe our collections: researchers, teachers and students mainly need a thematic description, whereas inamediapro clients need the description and indexing of footage. 2 uses, 2 databases, 2 types of description. Example DA/DL CPC97101061 (inamediapro) Brésil, le carnaval des enfants, Faut pas rêver
  • #15 Mediascope: used for the annotation and sequencing of media content in parallel with viewing. It also enables the capture of thumbnails that are themselves timecoded and indexed, for both quantitative and qualitative research. Used by cataloguers and documentalists, but also by researchers at Ina Thèque. The application is available online (it can be downloaded).
  • #16 This database is devoted to professional distribution services, which require specific editorial processing; this is facilitated by the first documentary stages. This processing pattern (shown in figure 1) does not prevent data redundancy between the databases (since each one is connected with its own processing applications and dedicated consultation interfaces) or the risk of divergent changes in an originally shared piece of information, since the two databases are not systematically co-synchronised. Even so, this model has the advantage of establishing communication between the databases and throwing light on the shared information they contain, a necessary preparation, and even a source of awareness, prior to the redesign of these applications in a single shared environment.
  • #17 Thesaurus: a vocabulary common to all uses and teams, but implemented in each database, which means double management! 4 categories: persons, organizations, places and concepts. Translated into 3 languages: English, Spanish and Arabic.
  • #18 An application that gathers many detailed instructions and rules used for archive description. As the number of archivists and cataloguers is significant (about 180), we needed to give definitions and organize information so that the audiovisual collections would be described in a coherent way. How to write titles. How to describe sports collections. How to write the name of a person, of a musical group… How to distinguish a report from a documentary? It gives very detailed instructions for documentary processing so that the teams (cataloguers and archivists) apply the same rules to the same type of collection.
  • #19 Working in silos, as opposed to working across processes: a kind of "silo" management approach.
  • #20 As you have understood, over the last 20 years Ina’s collections have been managed by two parallel systems designed to operate independently, each with its own processing applications and dedicated consultation interfaces.   Each one plays a specific role: one system distributes a relatively limited volume of content for commercial and professional purposes (mainly the sale of extracts); the other manages an ever-growing volume of content for research uses, notably trend studies applied to thousands or even tens of thousands of data items.   Links have gradually been established between these two systems of collection and these two types of use.   But there are still two problems with this processing pattern:   1/ Data redundancy between the databases. Some content can fall into both categories, for instance public television news, which is covered by both legal deposit and the professional archives (according to the terms of the service agreement signed between Ina and the national public broadcasters).   2/ A risk of divergent changes to the same record, since the two databases are not systematically synchronised.   Because of this fragmentation in data management, the documentary processes are complex and not fluid, and the metadata collected or created at Ina, especially in the framework of the legal deposit, are under-exploited.   It was becoming necessary to redesign these applications in a single shared environment.
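The second problem (divergent changes to a record held in both databases) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record identifiers, the field names and the idea of comparing two in-memory dictionaries are hypothetical and do not describe Ina's actual databases.

```python
from typing import Dict, List

Record = Dict[str, str]


def find_divergences(db_a: Dict[str, Record],
                     db_b: Dict[str, Record]) -> Dict[str, List[str]]:
    """Return, for each record present in both databases, the shared
    fields whose values differ between the two copies."""
    divergences: Dict[str, List[str]] = {}
    # Only records held in both databases can diverge.
    for record_id in sorted(db_a.keys() & db_b.keys()):
        a, b = db_a[record_id], db_b[record_id]
        # Compare only the fields that both copies actually carry.
        differing = sorted(f for f in a.keys() & b.keys() if a[f] != b[f])
        if differing:
            divergences[record_id] = differing
    return divergences


# Invented sample data: one record is held in both databases.
professional = {
    "news-001": {"title": "Evening news", "summary": "Edited summary"},
    "doc-002": {"title": "Documentary"},
}
legal_deposit = {
    "news-001": {"title": "Evening news", "summary": "Original summary"},
    "radio-003": {"title": "Radio show"},
}

divergent = find_divergences(professional, legal_deposit)
```

Here the shared record would be reported with its `summary` field, since that field was changed in one copy only.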
  • #21 This is why, since 2014, the Institute has been remodeling its documentary IT, in close coordination with the broader construction of a "data lake", which will allow the merging of metadata from all enterprise applications (documentary, legal, commercial).   The adoption of a big data policy will help to develop new ways of using and linking metadata, first for internal needs (metadata linked to legal or commercial information, for instance) and perhaps later with other collections.   This rationalization of metadata management should also help to build a policy for opening up the metadata, according to legal criteria still to be determined (such as their origin).
  • #22 Alongside this, a new application is being developed to ingest third-party sources via a single, systematic gateway. It covers any descriptive or technical metadata related to any type of carrier or digital file.   This component will contain mapping and alignment modules that will guarantee conformity with Ina’s new standards.
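As a rough illustration of what a mapping step in such a gateway could look like, here is a minimal sketch. The source field names, the target field names and the list of required fields are all hypothetical; the notes do not specify Ina's actual standards or the gateway's design.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from third-party field names to target field names.
FIELD_MAPPING: Dict[str, str] = {
    "programme_title": "title",
    "tx_date": "broadcast_date",
    "file_format": "item_format",
}

# Hypothetical list of fields the target standard would require.
REQUIRED_FIELDS: Tuple[str, ...] = ("title", "broadcast_date")


def map_record(source: Dict[str, str]) -> Tuple[Dict[str, str], List[str], List[str]]:
    """Rename known source fields to target names.

    Returns the mapped record, the source fields that have no mapping,
    and the required target fields that are still missing.
    """
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for name, value in source.items():
        target = FIELD_MAPPING.get(name)
        if target is None:
            unmapped.append(name)   # keep track of what was not aligned
        else:
            mapped[target] = value
    missing = [f for f in REQUIRED_FIELDS if f not in mapped]
    return mapped, unmapped, missing


# Invented third-party record used only to exercise the function.
mapped, unmapped, missing = map_record(
    {"programme_title": "Sample program", "codec": "sample-codec"}
)
```

Reporting unmapped and missing fields, rather than silently dropping them, is one way a single gateway could flag records that do not yet conform.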
  • #23 Finally, a new documentary system is being designed: to streamline the processing chain around a common tool; to rationalize metadata production (no more duplication); to guarantee better documentary quality (a specific component will handle quality work on massive data).
  • #24 We needed to adopt a new data model for three reasons: to meet the respective needs of both documentary activities in the same production and retrieval tool (more flexibility in the description); to allow better interoperability with other sources or collections (in the future, mainly for legal reasons); and to be able to adapt our description to new audiovisual objects or practices that could appear in the future (filmed radio, cross-media works, online television). The model is inspired by FRBR, although there is no work entity; instead there are 3 main entities: instance, event and item, each with different typologies. Types of instance: unique program (documentary), episode, subject of the evening news… Types of item: original film, copy, digital file… Types of event: production, broadcast, shooting, recording, publication…   Each entity will have its own attributes: annotation (text, descriptors…), activities…
  • #25 This is the new architecture in its simplest expression. Instance, event and item will be the skeleton of the data model. Depending on the media described (TV, radio, photo, text…), on whether it is broadcast or not, and on the level of description, we will be able to maintain the continuity of our current processes in a much more flexible way. For instance, in the legal deposit database there was one record for each broadcast of a program (and many programs are broadcast several times on different channels and in different formats); in the new system there will be one instance and as many broadcast events as there are broadcasts.
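The instance / event / item skeleton can be sketched with a few data classes. This is only an illustration under stated assumptions: the class attributes (`date`, `channel`, `identifier`), the linking of events and items directly to an instance, and all sample values are hypothetical and are not taken from Ina's actual model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    kind: str           # e.g. "production", "broadcast", "shooting"
    date: str           # hypothetical attribute (ISO date string)
    channel: str = ""   # hypothetical attribute, only meaningful for broadcasts


@dataclass
class Item:
    kind: str           # e.g. "original film", "copy", "digital file"
    identifier: str     # hypothetical attribute


@dataclass
class Instance:
    kind: str           # e.g. "unique program", "episode"
    title: str
    events: List[Event] = field(default_factory=list)
    items: List[Item] = field(default_factory=list)

    def broadcasts(self) -> List[Event]:
        """Return only the broadcast events attached to this instance."""
        return [e for e in self.events if e.kind == "broadcast"]


# Invented example: one program, produced once and broadcast twice.
program = Instance(kind="unique program", title="Sample documentary")
program.events.append(Event(kind="production", date="2015-06-01"))
program.events.append(Event(kind="broadcast", date="2016-01-10", channel="Channel A"))
program.events.append(Event(kind="broadcast", date="2016-02-20", channel="Channel B"))
program.items.append(Item(kind="digital file", identifier="file-0001"))
```

Where the legal deposit database would have held two records for this program (one per broadcast), the sketch holds a single instance carrying two broadcast events.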
  • #26 New applications will be implemented in the new system. They have been developed by the IT department over many years and are accessible online. They concern: speech to text, to help documentalists in their description work and to enrich the description of radio collections; connection to external sources, to adopt standards from other authorities and to facilitate future connections if objects are described in a homogeneous way; a new player, Amalia.js, with new functionalities compared to Mediascope and other players (better sequencing and footage timecoding).   The challenge, in the future data lake, is the management of massive volumes of data.
  • #27 An open data policy would be an opportunity to promote our content and to open the way to new explorations of the data. But the legal framework needed to protect the private life of people appearing in the media is not yet determined. Their identity, which is private and is mentioned in the record, would become accessible to anyone. This is not possible under the current legal framework. If our data were open, new collaborations could be imagined with partners such as cultural or educational institutions to elaborate
  • #30 We needed to adopt a new data model to meet the respective needs of both documentary activities in the same production and retrieval tool (more flexibility), to allow better interoperability with other sources or collections, and to be able to adapt our description to new audiovisual objects or practices that could appear in the future (video-recorded radio or any cross-media work, online television). It is inspired by FRBR (Functional Requirements for Bibliographic Records) but is not quite the same, for there is no work