Towards a new research infrastructure for the arts and humanities. Peter Doorn, Director, Data Archiving and Networked Services. Athens, May 6th, 2009
What do humanities scholars have to do with digital research infrastructures? Traditional image of the humanities scholar: a loner in his study
Ideas about the future may be false... Mouse? How the Rand Corporation envisioned in 1954 the home computer of the future (2004)
From humanities computing to e-humanities. Roots go back to the 1960s: text analysis (e.g. Bible studies), quantitative social and economic history, linguistics, archaeology. E-humanities as an analogy of e-science: ‘science increasingly done through distributed global collaborations enabled by the Internet, using very large data collections, large-scale computing resources and high performance visualisation.’
Humanities computing
Aetolian Studies Project: settlement history from prehistory to modern times
CLIWOC project (Climate of the World Oceans): collaboration of historians and meteorologists
Journal entry, 26-29 September 1758
Shipping routes, 1750-1850. Courtesy of CLIWOC project, KNMI
Average yearly temperatures, 1750-1850. Courtesy of CLIWOC project, KNMI
Wind direction and speed. Courtesy of CLIWOC project, KNMI
Wind directions and rainy days, Atlantic, 1770-1780. Courtesy of CLIWOC project, KNMI
Archaeology
Data collection in the field
Databases of finds
Photos, GIS, sherds
Virtual archive of finds, publications, data and documentation
Electronic Depot Netherlands Archaeology (EDNA)
Archaeology Data Service (UK)
POLEMON: Greek National Monuments Record System (by Panos Constantopoulos). Main characteristics: geographical distribution – federated architecture; administrative documentation for site monuments and moveable objects; cartographic documentation (GIS component); connection with thematic databases and term thesauri; operation modes: distributed, centralized, mixed; easy installation and startup. Installation and deployment on a national scale: ~65 installations in Greece; ~120,000 monuments recorded; trained personnel in every ephorate. (ΥΠΠΟ/ΔΑΜΔ)
Digital research data in the arts and humanities. Digital materials collected for research purposes, e.g.: History: digitized archival sources such as population registers, shipping journals, historical censuses, judicial verdicts, medieval manuscripts. Archaeology: excavation data (field reports, databases of finds, photos of objects, digital maps of sites, drawings of sherds). Linguistics: speech data, text corpora, video.
What does data look like? Multitude of forms: databases, spreadsheets, texts, audio, video, still images. Multitude of formats, since the 1960s: from home-grown applications (legacy data) via standard software to open standards. Data is often coded or “enriched”: it cannot be understood or used without ample documentation. It is often difficult to use without specific software.
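The point about coded data can be illustrated with a minimal sketch. The records, field names and codebook below are invented for illustration and are not taken from any of the projects mentioned; the sketch only shows that a coded record stays opaque until its documentation (here, a codebook) is applied.

```python
# Hypothetical coded records, e.g. as transcribed from a population register.
# All field names and codes are invented for illustration.
records = [
    {"id": 1, "occ": 3, "civ": 1},
    {"id": 2, "occ": 7, "civ": 2},
]

# Hypothetical codebook: the documentation that gives the codes their meaning.
codebook = {
    "occ": {3: "farmer", 7: "sailor"},
    "civ": {1: "unmarried", 2: "married"},
}


def decode(record, codebook):
    """Return a copy of `record` with coded values replaced by their labels.

    Fields without a codebook entry are kept unchanged; codes that are
    missing from the codebook are flagged rather than silently dropped.
    """
    decoded = {}
    for field, value in record.items():
        labels = codebook.get(field)
        if labels is None:
            decoded[field] = value
        else:
            decoded[field] = labels.get(value, f"unknown code {value}")
    return decoded


for record in records:
    print(decode(record, codebook))
```

Without the codebook, the value 3 in the field "occ" carries no meaning, which is why documentation has to be archived together with the data.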
Research Infrastructures (R.I.). R.I. in general: permanent and physical. R.I. for the natural sciences: ice breakers for polar research, satellites, telescopes, particle accelerators, laboratories. R.I. for the humanities? Cultural heritage in all forms is the main source of humanities research. Libraries and archives are the traditional “laboratories” for the humanities. In the digital age, essential for innovative humanities research are: access to digitised heritage data (databases, text corpora, speech, image collections, etc.) and tools to process this information. The most important new research infrastructure for the humanities is therefore a digital one.
European infrastructure challenges. In spite of some achievements, existing infrastructures are primarily national... if they exist at all! European activities have until now been funded on a project basis and carried out as voluntary activities by national partners. Stable, pan-European data infrastructures for the humanities hardly exist. The increasing internationalisation of humanities research places new demands on such infrastructures.
ESFRI Roadmap. ESFRI = European Strategy Forum on Research Infrastructures. First Roadmap launched in autumn 2006, updated in 2008. About 30 proposals for large-scale research facilities. Six proposals are in the area of social sciences and humanities.
The DARIAH idea. The Grand Vision: provide access to European humanities and cultural heritage information across time. A research infrastructure that can coordinate, catalyse, enhance and support the digital humanities. Digital research infrastructure for the humanities: provide permanent access to data collected/digitised in European projects (providing continuity for discontinuous activities); support research networks in the digital humanities. Structure: a strong nucleus in a cluster of networked organisations and satellites.
Science case. Changing research practice in a networked environment: data (including text, images, and other media) is the laboratory of the scholar in the humanities; resources on the web are distributed (data grid); the scale of research goes up (networked projects); new technologies and methods of analysis. However, European projects have no continuity. The existing structures are too weak (ad hoc networks, no permanence) and national in scope. Answer: a strong European data infrastructure.
A digital research infrastructure for the humanities is comparable to a virtual astrophysics observatory
 
 
Outline of tasks of DARIAH. Digitise – Curate – Preserve: standards development and promotion; preservation and digitisation support; R&D, technology platforms, tools development; legal services and advice on open access and I.P.R. Discover – Access – Deliver: authentication and authorisation; harvesting, aggregating, hosting; user-friendly discovery and delivery. Connect – Collaborate – Use: supporting communities of practice in digital humanities; facilitating innovative research practices; tools development and tools registries.
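As a rough illustration of the "harvesting, aggregating" and "discovery" tasks, the sketch below merges metadata records from several archives into one searchable index. It is a toy model under stated assumptions: the archive names, record fields and both function names are hypothetical, no real archive or protocol is contacted, and nothing here describes how DARIAH itself is implemented.

```python
def aggregate(sources):
    """Combine metadata records from several archives into one flat list.

    `sources` maps a (hypothetical) archive name to its list of records;
    each record is tagged with the archive it was harvested from.
    """
    index = []
    for archive, records in sources.items():
        for record in records:
            index.append({**record, "source": archive})
    return index


def discover(index, term):
    """Return the records whose title contains `term`, ignoring case."""
    term = term.lower()
    return [r for r in index if term in r["title"].lower()]


# Invented example data; no real archive is queried here.
sources = {
    "archive_a": [{"id": "a1", "title": "Ship journals 1750-1850"}],
    "archive_b": [
        {"id": "b7", "title": "Excavation field reports"},
        {"id": "b8", "title": "Ship routes, Atlantic"},
    ],
}

index = aggregate(sources)
hits = discover(index, "ship")
for hit in hits:
    print(hit["source"], hit["id"], hit["title"])
```

Keeping the "source" field on every record means the user can still be directed back to the archive that holds the data, which matches the idea of a distributed rather than a centralised collection.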
Preparation project: overview of the work packages. Project management; Dissemination; Strategic work; Financial work; Governance and logistical work; Legal work; Technical reference architecture; Technical: conceptual modelling.
DARIAH preparation project partners
Aspiring partners: Italy, Spain, Austria, Switzerland. Other prospective partners in: Romania, Bulgaria, Hungary, Lithuania, Sweden, Norway, Serbia, FYROM
Strategic, financial, organisational and legal objectives. Strategic: to determine the strategic vision, goals, objectives and policies for DARIAH, ensuring they are based upon and will meet stakeholder requirements, clearly identified business drivers, and identified benefits for European research. Financial: to define a sustainable business model for DARIAH that allows for the provision of long-term services to the European research community in the humanities, while ensuring adaptability to new user needs and new technological developments. Logistical: to deliver a business plan that describes the organisational set-up and the management structure, and the role of the institutions and persons involved (stakeholders, staff, experts, partners, expansion with new partners). Legal: to determine the rights and obligations of different types of DARIAH partners, allowing for the inclusion of new partners; to draft licence agreements and products and services contracts; ERI or non-ERI, that is the question.
Technical and conceptual objectives. Architecture: to draft the technical reference architecture of DARIAH, consisting of draft engineering plans as well as demonstrators for key enabling technologies. Conceptual: to develop the foundation of a coherent, interlinked, and collaboratively maintained virtual infrastructure of digital resources in the partner institutes; to model and evaluate the research processes in selected digital humanities disciplines.
ARENA demonstrator. Exemplary service-oriented architecture: ARENA at ADS, a Culture 2000 portal. Added value of DARIAH: migrate legacy applications; database integration: integrate access.
Fedora (manuscript) demonstrator. The NFI collection: the manuscripts AM 366-371 fol. in the Arnamagnaean Collection contain drawings and descriptions of all the Danish and Norwegian runic monuments (stones with runic inscriptions) which were known in the 1620s
Arts and humanities disciplines according to the European Reference Index for the Humanities (ERIH) and in selected countries
Relations to other projects and networks
Preparing DARIAH: time schedule. October 2006: publication of the ESFRI Roadmap. December 2006: publication of the relevant FP7 call. May 2007: deadline of the Capacities call for ESFRI projects. Q3 2008: agreement on EC funding. Q4 2008: start of “Preparing DARIAH”. Q1 2010: DARIAH conference. Q1 2011: start of construction of DARIAH. Financial commitment?
Summary. Mission: to enhance the European research infrastructure in the humanities by linking (and upgrading) distributed digital resources and merging them into a grid-empowered architecture, and by designing new facilities for pioneering research, preferably of an international and interdisciplinary nature. Structure: a single, distributed organisation that combines specialist knowledge of the fields with technology expertise in digital information and communication structures.
Summary (continued). Organising principle: a decentralised network; an international core in a cluster of national and thematic satellites. The core will bear responsibility for organising and supporting the network, for the basic infrastructure, and for the method and means of communication. The national ‘hubs’ will bear responsibility for the specific thematic or disciplinary expertise. The hubs will be prominent institutes and research networks with a leading role within the European context. The model is an open one and will be able to embrace new, promising fields that are as yet unable to play such a leading role in Europe.
Additional information: www.dariah.eu

DARIAH Athens May 2009
