Replicating Linguistic Resources
B2SAFE: MPI-TLA CLARIN Center
Willem Elbers (MPI-TLA)
2nd EUDAT Conference

Date: 29 October 2013
The Language Archive
• Data on languages:
– about 60 terabytes of well-described resources
– about 20,000 hours of digitized audio/video recordings
– about 73,000 metadata-described sessions
– about 4.5 million annotated segments
– data on more than 200 languages
– among these, data from about 60 DOBES teams
– acquisition, speech, multimodal, multilingual, language and cognition,
brain imaging, ethnological and other data.

• Mission:
– Maintain access to all stored resources for the current generation of
researchers, language communities and the interested public.
– Preserve the valuable cultural heritage for current and future generations.
The Language Archive

B2SAFE
• Goals
– Replication of data
• B2SAFE!

– Replication of services
• RZG providing Language Archive Technology services at
replica side

• B2SAFE Community extensions:
– Replication based on logical structure defined in the IMDI/CMDI
metadata
– Integrated with underlying SAM-FS
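
A minimal sketch of what "replication based on logical structure" could look like, not the actual B2SAFE community extension. It assumes a hypothetical in-memory stand-in for the IMDI/CMDI tree, in which each metadata file lists its child metadata files and the resources it describes; the file names and the `replication_set` helper are invented for illustration.

```python
from collections import deque

# Hypothetical stand-in for an IMDI/CMDI metadata tree (all names are made up):
# each metadata file lists its child metadata files and the resources it describes.
METADATA = {
    "corpus.cmdi": {"children": ["session_a.cmdi", "session_b.cmdi"], "resources": []},
    "session_a.cmdi": {"children": [], "resources": ["a.wav", "a.eaf"]},
    "session_b.cmdi": {"children": [], "resources": ["b.mpg"]},
}


def replication_set(root, metadata):
    """Return the metadata files and resources reachable from root, breadth-first."""
    ordered, seen, queue = [], set(), deque([root])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue  # guard against a file being linked from two parents
        seen.add(node)
        ordered.append(node)
        entry = metadata.get(node, {"children": [], "resources": []})
        # Resources are listed right after the metadata file that describes them.
        for resource in entry["resources"]:
            if resource not in seen:
                seen.add(resource)
                ordered.append(resource)
        queue.extend(entry["children"])
    return ordered


if __name__ == "__main__":
    for item in replication_set("corpus.cmdi", METADATA):
        print(item)
```

Walking the metadata rather than the directory tree means the unit of replication is a corpus or session as the archive describes it, independent of where the files sit on disk.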
[Slides 5 to 9: diagrams, not recoverable from the text. Surviving labels: "Approx: 80TB" on slides 6 to 9; "> 3TB" twice on each of slides 8 and 9.]

Summary

“Cultural Heritage Data replicated for the future”

• Data replication running in production
• LAT Software stack running @ RZG (beta)
• Replication of authorization records running (beta)
Questions

Thank you for your attention

Editor's Notes

  1. Provide access to replicated data as well. Installed community services at RZG EUDAT data center. Data and metadata (both stored in files), different PIDs.
  2. Todo: handles
  3. Todo: handles
  4. Todo: handles
  5. Todo: handles
  6. Todo: handles