Data Management: Expose, Preserve, Protect
  Ian Moore, Head, ICT, ILRI & World Agroforestry
  Mini-workshop, ILRIAPM2010
Objective of the Mini-Workshop
  • Raise Awareness: What’s being done? Who can help?
  • Capture Ideas: What’s important? How can we do more?
Agenda for the Mini-Workshop
  • Principles
  • Managing data – What scientists are doing
  • Exposure – How KMIS can Help
  • Preservation – How RMG can Help
  • Protection – How ICT can Help
Information Lifecycle
  • Create / Capture / Receipt
  • Analyse / Manage / Share / Use
  • Retain / Dispose / Preserve
Research Data Management and Archiving Policy
  Principles: Quality, Efficiency, Understandability, Exposure, Preservation, Ethics
CGIAR Research Outputs AAA Framework
  • Availability
  • Accessibility
  • Applicability
  http://ictkm.cgiar.org/what-we-do/triple-a-framework/
Managing Data What are scientists doing?
Exposure How KMIS can Help
Preservation How RMG can Help
Data Access Livestock: the good, the bad and the …
What is data access?
  Data access typically refers to software and activities related to storing, retrieving, or acting on data housed in a database or other repository.
  Data access happens in two stages:
  • Physical data access – location, access rights
  • Logical data access – the ability to understand the data, which is facilitated by the metadata
What is metadata?
  Data about data…
  • A structured description of characteristics such as the meaning, content, structure & purpose of a resource
  • A gateway to resource discovery by enabling field-based searches
  Where is metadata found?
  • Within the coding for a resource itself
  • In a database of descriptions / repository of resources
Examples of International Metadata Standards
  • Dublin Core
  • Data Documentation Initiative (DDI)
  • IEEE Draft Standard for Learning Object Metadata
  • SCORM – Sharable Content Object Reference Model
  Other bodies involved: Prometeus & CEN/ISSS, ISO, BSI
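
As an illustration of what a record in one of these standards can look like, here is a minimal sketch that describes a dataset with a few Dublin Core elements and serializes it to XML. The dataset values, the `metadata` wrapper element and the function name `to_dc_xml` are assumptions made for the example and do not come from the workshop; only the 15 element names and the namespace URI belong to Dublin Core itself.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements defined by the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def to_dc_xml(record: dict) -> str:
    """Serialize a flat dict of Dublin Core elements to an XML string."""
    unknown = set(record) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    ET.register_namespace("dc", DC_NS)
    # "metadata" is an arbitrary wrapper element chosen for this sketch.
    root = ET.Element("metadata")
    for name, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset description, for illustration only.
record = {
    "title": "Household livestock survey",
    "creator": "Example Research Team",
    "subject": "livestock",
    "date": "2010",
    "type": "Dataset",
    "rights": "Public after an embargo period",
}

xml_text = to_dc_xml(record)
print(xml_text)
```

Restricting the record to a fixed element set is what makes later field-based searching possible: every record answers the same questions in the same place.
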
Importance of metadata
  • Provides a standardized system to classify and label web content
  • Improves search relevancy
  • Provides an audit trail
  • Identifies redundant, duplicative, and obsolete content
  • Tracks institution-wide assembled information
Metadata benefits – to data users
  • Enables searching, retrieval, and evaluation of data set information both within and outside organizations
  • Finding data: determine which data exist for a specific area of interest
  • Access and Transfer: acquisition, utilization and access rights
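
The "finding data" benefit can be sketched as a field-based search over metadata records. This is only an illustration: the records, the field names and the function `find_records` are hypothetical, and a real catalogue would query a database or repository instead of an in-memory list.

```python
def find_records(records, **criteria):
    """Return records whose fields contain every given search term.

    Matching is a case-insensitive substring test; a field that is
    missing from a record is treated as empty.
    """
    matches = []
    for record in records:
        if all(
            needle.lower() in record.get(field, "").lower()
            for field, needle in criteria.items()
        ):
            matches.append(record)
    return matches


# Hypothetical catalogue entries, for illustration only.
catalogue = [
    {"title": "Household survey", "subject": "livestock",
     "coverage": "Kenya", "rights": "public"},
    {"title": "Rainfall series", "subject": "climate",
     "coverage": "Ethiopia", "rights": "restricted"},
    {"title": "Herd census", "subject": "livestock",
     "coverage": "Ethiopia", "rights": "public"},
]

# Which livestock data exist for Ethiopia?
hits = find_records(catalogue, subject="livestock", coverage="ethiopia")
for hit in hits:
    print(hit["title"], "-", hit["rights"])
```

The `rights` field in each hit also answers the access question: a user can see whether data are open before asking for them.
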
Metadata benefits – to organizations
  • Organizes and maintains an organization's investment in data
  • Documents data processing steps, quality control, definitions, data uses and restrictions, etc.
  • Transcends people and time; offers data permanence and creates institutional memory
  • Saves time, money, frustration
The future … Web-based system – KnowledgeTree
  KnowledgeTree – a DMS (document management system) that manages the document lifecycle, promotes collaboration and ensures compliance
  Why KnowledgeTree
  • Customizable metadata capture
  • Web-based access
  • Role and group-based permissions
  • RSS feeds and subscription integration
  • Open-source – customizable
  • Workflow integration
  • Notifications and alerts
  • Version control
Folder structure
Metadata capture Customizable metadata
Protection How ICT can Help
Would your data recover from any one of these?
Levels of Risk
  • Level 1: incident management – Equipment failure, data loss or corruption
  • Level 2: local relocation of ICT resources – Fire, flood, or similar local disaster
  • Level 3: relocation within country – Earthquake, major fire or flood
  • Level 4: relocation to a different country – Civil unrest, major natural disaster
Backup Strategies
  • Strategy 1 – Single copy on online storage. Protects against risk level 1.
  • Strategy 2 – Additional copy on offsite storage. Protects against risk levels 1 and 2.
  • Strategy 3 – Additional copy at a 3rd location, out-of-country. Protects against risk levels 1, 2 and 3.
  • Strategy 4 – Multiple copies, with frequency and retention period decided by the data owner. Protects against risk levels 1, 2, 3 and 4.
Thank you More Discussion

Editor's Notes

  • #5 Methodology for the effective management of information and data throughout its useful life
  • #6 Quality - Explicit data validation procedures are used and documented throughout the research process.
    Efficiency - Data management procedures are designed to improve the efficiency of the research process.
    Exposure - Research data are in the public domain unless restrictions will serve our mission better (as per IP policy). The key concern here is to balance the rights of the data originator (i.e. getting credit for what he/she has done) with the duty of the institution (i.e. getting the most value from the data, no matter who collected them). As a general rule we allow researchers exclusive use of the data for up to 18 months.
    Preservation - Data generated from research activities are secured in long-term archives. It is possible to find what data have been generated and archived. Potential users are able to retrieve data in a usable format, subject to restrictions in Principle 3.
    Understandability - Sufficient information is linked to the data for potential users to decide if it meets their requirements.
    Ethics - Confidentiality of data on human subjects is respected in accordance with national data protection acts (where they exist) and well-established international standards.
  • #7 The Triple-A Framework developed by the ICT-KM Program seeks to help CGIAR Centers/Programs and their scientists decide on the level of Availability, Accessibility and Applicability (AAA) they want for their research outputs, and also the pathways with which to turn these outputs into International Public Goods.
  • #8 Isabelle to talk about folder structure on Central Storage. Bruno to talk about SLP, Google Apps and CGXchange.
  • #22 Ask the audience if they are ready for a disaster happening in their Centers: a disaster that would prevent them from using their critical IT systems support or even from operating in their current location. It is a question that should guide every participant during the BCP workshop. The end objective of the workshop is really to increase awareness of the need for BCP among CGIAR key management staff.
  • #23 Level 1: a relatively small disaster that will not involve a change in location but will require data or resources to be restored. This type of disaster is usually covered by ICT Services through the incident management procedures.
    Level 2: a more substantial disaster that affects most of the services hosted in the server room. A decision will need to be taken whether to set up a temporary server room on campus or at the disaster recovery site at ICRAF.
    Level 3: a significant disaster that prevents services being provided from the main campus. This will require temporary services to be restored at the disaster recovery site at ICRAF.
    Level 4: a major disaster that will involve relocation of the Centre and its staff to an alternative location in a different country.
  • #24 Strategy 1 – Low criticality to the operation of the Centre. Original data is stored on individual computers. A copy of the data is taken at scheduled times to local online storage. This protects against risk level 1.
    Strategy 2 – Medium criticality to the operation of the Centre. Original data is stored on individual computers, on servers or online locally and is usually owned by individuals, shared by groups, or classed as Institutional information or application data. A copy of the data is replicated to the online remote storage. This protects against risk levels 1 and 2.
    Strategy 3 – Medium criticality to the operation of the Centre. The original data is archived data that does not change and is stored online locally. A copy of the data can be made to transportable media and stored in a 3rd location. This protects against risk levels 1, 2 and 3.
    Strategy 4 – High criticality to the operation of the Centre. The original data is stored on servers or online locally and is usually owned by individuals, shared by groups, or classed as Institutional information or application data. One or more copies of the data are made online to the remote location. Copies of the data are also made to transportable media and stored in a 3rd location. The frequency of the copies and the length of time they are kept will be negotiated with the owner of the data. This protects against risk levels 1, 2, 3 and 4.
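
The relationship between the four backup strategies and the four risk levels can be written down as a small lookup. This is a minimal sketch: the coverage table is taken from the slides, while the function `minimal_strategy` and the idea of choosing a strategy from the highest risk level to be survived are assumptions made for illustration. The notes themselves select a strategy by how critical the data is to the Centre.

```python
# Risk levels each backup strategy protects against, as stated on the slide.
STRATEGY_COVERAGE = {
    1: {1},           # single copy on local online storage
    2: {1, 2},        # additional copy on offsite (remote) storage
    3: {1, 2, 3},     # additional copy at a 3rd location
    4: {1, 2, 3, 4},  # multiple copies, schedule agreed with the data owner
}


def minimal_strategy(highest_risk_level: int) -> int:
    """Return the lowest-numbered strategy that covers the given risk level.

    Hypothetical helper: raises ValueError for a level no strategy covers.
    """
    for strategy in sorted(STRATEGY_COVERAGE):
        if highest_risk_level in STRATEGY_COVERAGE[strategy]:
            return strategy
    raise ValueError(f"no strategy covers risk level {highest_risk_level}")


# Data that must survive a disaster requiring relocation within the country.
chosen = minimal_strategy(3)
print(f"Strategy {chosen} covers levels {sorted(STRATEGY_COVERAGE[chosen])}")
```

Because each strategy covers every level below its own, the lookup reduces to matching the strategy number to the risk level; keeping the table explicit makes it easy to adjust if coverage changes.
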